Mon. Not. R. Astron. Soc. 000, [TH23] (2012) Printed 11 December 2012 (MN I^TeX style file v2. 2) 



Simulations of the galaxy population constrained by 
observations from 2; = 3 to the present day: implications for 
galactic winds and the fate of their ejecta 

Bruno M. B. Henriques^*, Simon D. M. White\ Peter A. Thomas^, 
Raul E. Angulo^, Qi Guo'^, Gerard Lemson\ Volker Springel^'^ 

^ Max- Planck- Institut fiir A strophysik, Karl-Schwarzs child- Str. 1, 85741 Garching b. Munchen, Germany 
^Astronomy Centre, University of Sussex, Palmer, Brighton BNl 9QH, United Kingdom 

^ Kavli Institute for Particle Astrophysics and Cosmology, Stanford University, Menlo Park, CA 94025, USA 
Partner Group of the Max- Planck- Institut fiir Astrophysik, National Astronomical Observatories, Chinese Academy of Sciences, 
Beijing, 100012, China 

^ Heidelberger Institut fiir Theoretische Studien, Schloss-Wolfsbrunnenweg 35, 69118 Heidelberg, Germany 
^ Zentrum fiir Astronomic der Universitdt Heidelberg, ARI, Monchhofstr. 12-14, 69120 Heidelberg, Germany 



Submitted to MNRAS 



ABSTRACT 

We apply Monte Carlo Markov Chain (MCMC) methods to large-scale simulations 
of galaxy formation in a ACDM cosmology in order to explore how star formation 
and feedback are constrained by the observed luminosity and stellar mass functions of 
galaxies. We build models jointly on the Millennium and Millennium-II simulations, 
applying fast sampling techniques which allow observed galaxy abundances over the 
ranges 7 < logAf^/M© < 12 and ^ z ^ 3 to be used simultaneously as constraints 
in the MCMC analysis. When z = constraints alone are imposed, we reproduce the 
results of previous modelling bv lGuo et al.l (|2012f ). but no single set of parameters can 
reproduce observed galaxy abundances at all redshifts simultaneously, reflecting the 
fact that low-mass galaxies form too early and thus are overabundant at high redshift 
in this model. The data require the efficiency with which galactic wind ejecta are reac- 
creted to vary with redshift and halo mass quite differently than previously assumed, 
but in a similar way as in some recent hydrodynamic simulations of galaxy formation. 
We propose a specific model in which reincorporation timescales vary inversely with 
halo mass and are independent of redshift. This produces an evolving galaxy popu- 
lation which fits observed abundances as a function of stellar mass, B- and i^-band 
luminosity at all redshifts simultaneously. It also produces a significant improvement 
in two other areas where previous models were deficient. It leads to present day dwarf 
galaxy populations which are younger, bluer, more strongly star-forming and more 
weakly clustered on small scales than before, although the passive fraction of faint 
dwarfs remains too high. 

Key words: methods: numerical - methods: statistical ~ galaxies: formation - galax- 
ies: evolution - stars: AGB 



1 INTRODUCTION 



The semi-analytic modelling technique uses a set of 
parametrised, physically based equations to describe each 
of the astroph ysical processes that affect galaxy formation 
and evolution (|Whitelll989l : IColelll99ll : iLacev fc Silklll99ll : 
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IWhite fc Fr"en5ll99ll ). The development of the galaxy pop- 
ulation as a whole is followed by applying these recipes to 
a set of merger trees representing the hierarchical assem- 
bly of nonhnear structure in the dark matter distribution. 
Such halo merger tre es can be construc t ed effi ciently us- 
ing e xtensions of the iPress fc Schechteij lll974l) formalism 
fe.g. ICold Il99l l: 'Kauffmann et al.l Il993l : ICole et all Il994l : 
ISomerville fc Primack 1999 1 but if spatial and kinematic in- 
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formation is needed, for example, for analysis of galaxy clus- 
tering, it is more eff ective to extract them directly from nu- 



merical simulation s (Roukcma ct al. iKauffmann et alJ 

1 19991 : iHellv et al.| [2003: Hatton ct al. 2003). With increas- 
ing numerical resolution it becomes possible to base the 
trees not on the haloes themselves but rather on the in- 
dividual gravi tationally self-bound subhaloes from which 
they are built fep rmgel et "alll200ll . 12005*: 'Ka ng et"alll2005l : 
iDe Lucia ct al. 200"^) . This increases the fidelity with which 
the galaxy distribution is simulated, but introduces a de- 
pendence on numerical resolution which must be tested 
through careful conv ergence studies (e.g. ISpringel et al.l 
l200ll : iGuo et al.ll201ll ). 

The great advantage of such semi-analytic simulations is 
their decoupling of the computation of dark matter evolution 
from that of the baryonic components. The physically con- 
sistent but relatively simple models adopted for complex and 
poorly understood baryonic processes allow a wide range of 
possible descriptions to be explored. The optimal parame- 
ters for each can be determined in an acceptable amount of 
computer time, allowing the different modelling assumptions 
to be tested by detailed comparison with the relevant ob- 
servational data. At present, semi-analytic methods are the 
only technique able to simulate the evolution of the galaxy 
population on a scale and with a precision which allows de- 
tailed interpretation of modern surveys of the galaxy pop- 
ulation. The price, of course, is that they provide at best 
only very crude information about the internal structure of 
galaxies or the structure of the intergalactic medium. 

The latest v e rsion of the Munich semi-analytic model 
||Guo et al.ll201ll . |2012| . hereafter Gil) was tested and its 
parameters adjusted by comparison with low-redshift galaxy 
data, namely, stellar mass and optical luminosity functions, 
gas fractions, colours, gas-phase metallicities, bulge-to-disk 
ratios, bulge and disk sizes, and the black hole - bulge 
mass relation. Predictions were then tested by looking at 
satellite abundances, clustering, the evolution of the stel- 
lar mass function and the integrated star formation history. 
This analysis revealed a good match between model and 
data, particularly for the present-day universe, but some dis- 
crepancies remained. Most notable were an excess of lower 
mass galaxies at high r edshift (also visible in the A'-band 
luminosity function of iHenrigues et al] 1201 j ). a substan- 
tial population of red dwarfs at low redshift, where very 
few such galaxies are observed, and ex cessive sma l l-scale 
clustering of low-redshift dwarf galaxies. IGuo et al] (|2012l ) 
showed that only the last of these problems is significantly 
alleviated if a WMAP7 cosmology is adopted instead of 
the WMAPl cosmology of the original simulations. Similar 
problems are fo und in most recent semi-analytic galaxy for- 
mation models (iFontariot et al.l 20091: Cirasuqlo et al.l 2010l: 



Henriques et al.' '201l': 'Som erville et al.l l201ll : iBower et al.l 
2012; Wcinmami ct al. 201^)7 and can thus be viewed as 
generic. 

In this paper we revisit the predictions of the Gil 
semi-analytic model using the Mo nte Carlo Markov Chain 
(MC MC) methods introd u ced b y iHenrigucs et al .l (120091') 
and 
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Henriques fc Thomas! (I201C) (see Kampakog lou et al.l 
Bower et al.ll2010l : ILu et al.ll2010l for related methods) . 



proach offers a systematic and objective means to identify 
the regions of parameter space consistent with the data and 
to obtain robust best-fit parameter estimates with associ- 
ated confidence ranges. This can yield new insight into how 
the data constrain galaxy formation physics. 

We start by investigating the uniqueness of the param- 
eter choices of Gil, finding that they are indeed nearly 
optimal when representing low-redshift data. We then use 
the MCMC technique to identify the parameter regions pre- 
ferred by the observational data at other redshifts, testing 
whether they overlap with each other and with the region 
preferred by the low-redshift data. Although almost all the 
Gil parameters fall in the allowed region at all redshifts, the 
rate at which gas is reincorporated after ejection in a wind is 
required to change. Re-accretion must be less efficient than 
originally assumed at early times, and more efficient at late 
times. We use the MCMC scheme to identify a simple mod- 
ification of the treatment of this process which can explain 
the data simultaneously at all redshifts. We then test this 
prescription by comparing its predictions with a broad set 
of observations including stellar metallicities, galaxy colours, 
star formation rates and ages, clustering and the evolution 
of the stellar mass-halo mass relation. 

This work uses merger trees from two very large 
dark matter simulations, which are well adapted to study 
galaxy formation ph ysics on a variety o f scales. The Mil- 
lennium Simulation (ISpringel et al.l [20051 ) follows 10^" par- 
ticles in a cube of side 500/i~'^Mpc, implying a particle 
mass of 8.6 X 10^/i~^M(7), while th e Millennium-H Sim- 
ulation (|Bovlan-Kolchin et al.l |2009| ) uses the same num- 
ber of particles in a region a fifth the linear size, result- 
ing in 125 times better mass resolution. Combined, the two 
simulations produce a galaxy formation model with a dy- 
namic range of five orders of magnitude in stellar mass. 
The distribution of physical properties converges for galax- 
ies with lO^ '^M© < M* < 10" '^ M0. The simulations can 
be scaled from the original WMAPl cosmology to the cur- 
rently preferred WMAP 7 cosmology using the technique of 
lAngulo fc Whitel (|2010l ). Galaxy populations almost identi- 
cal to those in the original Gil model can then be obtained 
through sm all adjustments to the semi-analytic modelling 
parameters (|Guo et aLll201^ . This scaling changes the box- 
size and particle ma ss of the simulat ions, as well as shifting 
their time axis (see lCuo et al.|[2012l . for details). Below we 
work in the WMAP7 cosmology using the scaled merger 
trees. 

The physical models used here for processes such as 
cooling, star formation and feedback have been developed 
gradually wit hin the "Munich" galaxy formation model over 
many years llWhite fc Freiiklll99ll : IKauffmann et al.lll993l . 
I1999I : ISpringel et al.ll200ll . bOOsl V The current Gil version 
inc ludes a robust treatm ent of supern ova feedb ack (follow- 
ing iDe Lucia et aL 2004 rather than iCroton et al.ll20o3 . see 
also lBenson et aLlbOOa! ). It follows t he growth of black holes 
through accretion and merging (|Kauffmann fc Haehneltl 
|2000^ and the quenc hing of star-formation by AGN feedback 
( Croton et al.l I2OO6I ) . It treats separately dust extinction 



We constrain parameters using observational estimates of 
the stellar mass function and of the K- and i3-band lumi- 
nosity functions at a variety of redshifts. The MCMC ap- 



from the inter-stellar med ium and from young birth clouds 
IIDe Lucia fc BlaizotI lioOTi ) and it can predict luminosities 
over a wide wavelength range using two different stellar pop - 
ulation synthesis models (see iHenriques et al.l 1201 ll . 120121 ) . 
Gil changed several of the baryonic physics recipes in or- 
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der to improve the treatment of dwarf and satellite galaxies. 
Supernova feedback was increased in low-mass galaxies, and 
environmental effects on satellites were treated more realisti- 
cally. They also improved the tracking of angular momentum 
as material moves between the various gaseous and stellar 
components, allowing a better treatment of the sizes of disks 
and bulges. 

This paper is organized as follows. In Section O we de- 
scribe our implementation of the MCMC method. Section [3] 
applies this method to the Gil model, constraining param- 
eters independently with data at each of a set of redshifts. 
No parameter set is consistent with the data at all redshifts. 
Guided by these MCMC results, we formulate in Section [3] 
a modification of the Gil model which can fit all redshifts 
simultaneously. The predictions of this new model are com- 
pared to the constraining observations in Section [S] while 
Section [S] tests it against observations of additional prop- 
erties that were not used as constraints. In Section [71 we 
summarize our conclusions. Appendix |X] sets out the equa- 
tions that define the parameters explored using MCMC sam- 
pling, while Appendix [B] describes how we construct a rep- 
resentative sample of dark matter trees which allows us to 
obtain results rapidly for any specific semi-analytic model. 
Appendix [C] gives details of the observational data we use 
and of the uncertainties we assume for them, when using 
MCMC techniques to con strain model param eters. 

A recent preprint bv lMutch et al.l l|2012l ') used MCMC 
techniques to expl ore similar issues using the model of 
ICroton et al.l l|2006f ) applied to the Millennium Simulation 
in its original WMAPl cosmology. Because they used stel- 
lar mass functions alone, and did not consider epochs earlier 
than z = 0.83, the evolutionary problem addressed here was 
only weakly present in their observational constraints. As a 
result they came to somewhat different conclusions than we 
do in this paper. 

Throughout this work we use the lMarastonI (|2005l ') stel- 
lar population synthesis model, but we have checked that 
for all the properties considered here the lCharlot fc Bruzuall 
l|2007ll code gives very similar results. As noted above, 
we assume a WMAP7 cosmology with specific parameters 
h = 0.704, n,n = 0.272, = 0.728, n = 0.961, and 
(Tg = 0.81, using the MS an d MS-II merger trees scaled 
exactly as in iGuo et all (|2012h . 



2 MONTE CARLO MARKOV CHAINS 

Semi-analytic models simulate a large number of complex 
physical processes whose interplay shapes the distribution 
of galaxy properties. As a result, it is not straightforward 
to find the model that best fits a given set of observa- 
tions, and to understand how unique a specific prescrip- 
tion or parameter set is. This becomes even more problem- 
atic when treatments are modified to allow interpretation of 
new kinds of observation, because such modifications often 
destroy the match to existing data. In addition, when rea- 
sonable agreement appears impossible to achieve, it is hard 
to know whether this reflects a failure to find the correct 
parameter set in a high-dimensional space, or overly sim- 
plistic modelling of one or more of the processes treated, or 
the omission of an important process, or an issue with the 



parent dark matter simulation or the structure formation 
paradigm underlying it. 

Many of these problems can be mitigated by combin- 
ing constraints from multiple observations of a wide range 
of galax;y properties with proper sampling of the high- 
dimensional model parameter space. Monte Carlo Markov 
Chain (MCMC) algorithms can be used to clarify how in- 
dividual parameters influence speciflc galaxy properties, to 
obtain confldence limits for parameters, and to quantify 
the agreement between model and observation in a sta- 
tistically robust way. They we re flrst applied to the Mu- 
nich galaxy formation model bv lHenrigues et al] (|2009l ) and 
iHenriaues fc Thomas! (|2010l '). Such algorithms sample the al- 
lowed parameter regions at a rate proportional to the pos- 
terior probability of the model conditioned by the observa- 
tional constraints, making them very efficient in analysing 
high-dimensional spaces. Section 3 of iHenrigues et akl (|200£ ) 
gives a full description of the Metro p olis-Hastings al gorithm 
used here (|MetropoHs et all Il953l : iHastingj Il970l 'l which 
builds chains in which each step to a new set of parameters 
depends only on the current parameter set, not on the previ- 
ous history. Appendix [Bl gives some technical details of our 
implementation. Of particular relevance is our identiflcation 
of a subset of merger trees which allows rapid convergence to 
results which are statistically representative of the full cos- 
mological volume. With this approach, we flnd ~ 1% of the 
full data to be sufficient to reproduce the stellar mass func- 
tion of the combined Millennium and Millennium-II sim- 
ulations to the accuracy we need, speeding up the MCMC 
calculations by about a factor of 100 and allowing us to con- 
strain the model using the observed abundance of galaxies 
with stellar masses ranging from 10^ Mq to 10^^ Mq . 

2.1 Observational constraints 

A crucial aspect of semi-analytic modelling, and of MCMC 
analysis in particular, is the choice and characterisation of 
the observational data used as constraints. The allowed re- 
gions in parameter space and the ability to find a good simul- 
taneous fit to multiple data sets depend sensitively on the 
error bars assigned to the observations. Purely statistical er- 
rors are well defined and accurately stated in many observa- 
tional studies, but it is much harder to account appropriately 
for residual systematic uncertainties. Unrecognised system- 
atics can lead to apparent inconsistencies between different 
determinations of the same population property, jeopardis- 
ing a meaningful comparison with theoretical predictions. 

Different approaches to this problem hav e been adopted 
in earlier work. For example, iBower et al.l (|2010l ) tested 
their model against a single data set to which they as- 
signed a confidence region refl ecting their own assess ment 
of its systematic uncertainty. iHenrigues et al.| (|2009l ) and 
Hcnriqucs & Thomas (2010) use multiple "good" determi- 
nations of each observational property, taking the scatter 
among them (together with the quoted statistical errors) 
to indicate likely systematic uncertainties. Here we adopt a 
version of the latter approach, noting that it still involves 
arbitrary judgements based on incomplete understanding, 
so that formal levels of agreement or disagreement between 
theory and observation should be treated with caution. In 
section (5] theoretical models are compared with combined 
data sets where we have used the quoted purely statistical 
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errors and the scatter among different "good" measurements 
to make a subjective assessment of the effective uncertain- 
ties. In Appendix [C] we show the individual data sets and 
provide further details on how they were combined to pro- 
vide constraints for each property at each redshift. 

The allowed region of parameter space for a given model 
depends not only on the uncertainties assigned to each obser- 
vation, but also, of course, on the particular observational 
quantities taken as constraints. This choice should recog- 
nise both the discriminative power and the robustness of 
the data. Use of complementary types of data can increase 
the aspects of the model which are significantly constrained. 
In the analysis of this paper we use the stellar mass func- 
tion, and the K- and _B-band luminosity functions at red- 
shifts 3, 2, 1 and 0. The abundance of galaxies as a function 
of stellar mass is one of the most fundamental properties 
of the population, but it has the disadvantage that it is 
not directly observable, requiring assumptions about stel- 
lar populations and reddening. The i-band luminosity func- 
tion would be a useful surrogate, since it is insensitive to 
dust and is unaffected by emission from stars on the poorly 
understood Thermally Pulsing - Asymptotic Giant Branch 
(TP-AGB), but few observational studies are available at 
redshifts other than zero, making it hard to assess the sys- 
tematic uncertainties. We therefore use determinations of 
the stellar mass function itself. When needed, we follow 
iDommguez Sanchez et all (l201ll') in applying a c orrect ion 
of AM, = -0.1 4 to go from iBruzual fc Charlol (120031 ') to 
iMarastonI |200^ stellar populations at 2; 5^ 1. As previously 
mentioned, the combined dynamical range of the Millennium 
and Millennium-II simulations allows us to compare model 
and data directly, even for the lowest stellar masses observed 
at z = 0. 

A critical test of galaxy formation models comes from 
comparison with the observed, redshift-dependent distribu- 
tions of colour and star formation rate. We do not include 
direct observational estimates of these properties in our 
MCMC sampling, however, since colours are very sensitive 
to the details of dust modelling and observational estimates 
of star formation rates have large and redshift-dependent 
systematic uncertainties which are difficult to characterize 
due to the small number of observational studies available. 
Instead, we include colour information indirectly by supple- 
menting our stellar mass function data with estimates of 
rest-frame B- and K-hand luminosity functions. This pro- 
vides some leverage on the star formation history, dust con- 
tent, metallicity and TP-AGB emission of galaxies. 

For each property, we combine determinations based on 
wide and narrow surveys in order to obtain constraints over 
a large dynamic range. The constraints used for the stellar 
mass, i<'-band and _B-band luminosity functions at differ- 
ent redshifts are shown in Figures |4l [5] and |6] respectively. 
Appendix [Cl explains how these were obtained from various 
observational studies. The likelihood of the model given the 
adopted observational constraints on each individual prop- 
erty is defined as: 



£ QC exp -X /2, 



where x is given by: 

i=0 



0modGl(») — (l>oha{i) 
O-obs(j) 



(1) 



(2) 



and (^niodci and 0obs are respectively the logarithms of the 
number densities of galaxies in each bin as predicted by the 
model and as determined observationally. o-obs(j) is our char- 
acterisation of the observational uncertainty in (^obs(j) (see 
Appendix [Cj. The final likelihood we assign to the model is 
then the product of the likelihoods for each galaxy property, 
either at a single redshift, or combining all redshifts: 

£ = 11 (/:[SMF(z)]£[LFk(z)]/:[LFb(z)]) « exp -Xt/2, 

z 

(3) 

where Xt is the sum of the values for all the individual 
stellar mass and luminosity function^. For a single redshift, 
this approach is a formal version of that used, for example, 
by Gil. When all redshifts are combined, the galaxy for- 
mation physics are constrained in a statistically robust way 
which takes full advantage of the fact that the semi-analytic 
model links galaxy populations self-consistently across cos- 
mic time using a deterministic and physically based treat- 
ment of the evolution of individual galaxies. 



2.2 Parameters 

We allow a total of 11 parameters to vary when sampling 
the parameter space of the Gil model. These are: the star 
formation efficiency (osf); the black hole growth efficiency 
(./bh); the AGN radio mode efficiency (/cagn); three param- 
eters governing the reheating and injection of cold disk gas 
into the hot halo phase by supernovae, the gas reheating 
efficiency (e), the reheating cutoff velocity (Vrchoat) and the 
slope of the reheating dependence on Vvii (/3i); three pa- 
rameters governing the ejection of hot halo gas to an ex- 
ternal reservoir, the gas ejection efficiency {rj), the ejection 
cutoff velocity (Kject) and the slope of the ejection depen- 
dence on K'ir (/32); the efficiency of reincorporating gas from 
the external reservoir to the hot halo (7) or the gas return 
time in our new reincorporation model (7'); and the yield of 
metals returned to the gas phase by stars (zyiew). The equa- 
tions describing these processes and defining the parameters 
are described concisely in Appendix [X] The very fact that 
such a high-dimensional parameter space can be analysed 
at all in a reasonable amount of computer time emphasizes 
a key strength of the MCMC method. Using naive tech- 
niques would require computing ~ 10^^ models to survey 
the space with just 10 points along each parameter dimen- 
sion. In contrast, we achieve convergence for chains with as 
few as 30,000 steps, although we always run at least 100,000 
steps and discard the first 10% as a "burn in" phase. 



3 TESTING THE Gil MODEL 

We begin by using our MCMC method to explore the re- 
gions of parameter space for which the Gil galaxy formation 
model, applied within a WMAP7 cosmology as in lCuo et al.l 
(2012), is consistent with the observational constraints at 
each redshift, independent of the constraints at the other red- 
shifts. It is interesting that in each case the high-likelihood 



^ Note that we neglect correlations in the scatter between dif- 
ferent data points and different data sets, even though these un- 
doubtedly exist and may be important. 
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Figure 1. Allowed likelihood regions for various semi-analytic parameter pairs at four different redshifts. Star formation efficiency (osf) 
and gas reincorporation efficiency (7) are shown in the four upper left panels; black hole growth efHciency (/bh) a-nd AGN radio mode 
efficiency (fcAGN) in the four upper right panels; gas reheating efficiency (e) and gas ejecti on efficiency (r]) in the four lower left panels; 
star formation efficiency (qsf) and metal yield (zyieid) in the four lower right panels. The lGuo et alj l l20f ll . [iof 2h model is constrained 
independently at each of the four redshifts using observational estimates of the stellar mass function and the K- and B-band luminosity 
functions. Each (logarithmic) parameter space is divided into equal-area pixels and 2D histograms made from our MCMC sampling. 
Solid magenta lines then show equal likelihood contours containing 68% and 95% of all samples (e.g. of the marginalised 2D posterior 
distribution). Colours indicate the maximum likelihood value in each bin (the profile distribution) normalized by the global maximum 
likelihood. The colour scale is labelled by logj^Q L/Lmax- 



region is quite localised in the full 11-dimensional space. 
The observational data at a single redshift are already suf- 
ficient to constrain all the parameters without major de- 
generacies. Furthermore, at z = the high-likelihood region 
includes the parameter set chosen in the earlier paper, show- 
ing that the less formal tuning procedure adopted there did, 
in fact, result in near-optimal parameters. Comparing pre- 
ferred parameter space regions at different redshifts we find 
very substantial overlap. Only one of the 11 parameters, the 
efficiency 7 with which ejected gas from the external reser- 
voir is reincorporated into the hot halo, is required to take 
substantially different values at different times. 

This can be seen clearly in Fig. [T] which shows 
the posterior likelihood distributions at the four redshifts 
marginalised into four different two-parameter subspaces: 
star formation efficiency and gas reincorporation efficiency 



in the upper left panels; black hole growth efficiency and 
AGN radio mode efficiency in the upper right panels; gas re- 
heating efficiency and gas ejection efficiency in the lower left 
panels; and star formation efficiency and metal yield in the 
lower right panels. In each panel, equal likelihood contours 
containing 68% and 95% of the marginalised posterior dis- 
tribution are shown as thick magenta lines, while colours in- 
dicate the maximum likelihood over all the MCMC samples 
in each pixel. In fact, the 95% regions for seven of the eight 
parameters overlap at least marginally across all four red- 
shifts, whereas the reincorporation efficiency (7) is required 
to evolve strongly. The observations indicate that very little 
ejected gas is reincorporated at high redshift, but that rein- 
corporation is substantially more effective at z ^ 2. This 
appears required within the Gil representation of galaxy 
formation physics in order to obtain the observed low abun- 
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dances of lower mass galaxies at z ^ 2 while maintaining 
the higher abundances observed for similar mass galaxies 
at lower redshift. There is some indication that the super- 
nova ejection efficiency t) should be lower at 2 ^ than at 
higher redshift (x-axis on the bottom left panels) but the 
variation required is much smaller than the 4 orders of mag- 
nitude needed for 7 and a reasonable compromise value can 
be found on the edge of the 95% regions (logr; ^ —0.3). 

It is worth stressing the discerning power of this pro- 
cedure. It indicates that most of the physical assumptions 
made in the Gil model are consistent with the observed 
evolution of the galaxy population over 3 > z > 0. In addi- 
tion, it clearly indicates which of the assumptions need to 
be modified and how they should be corrected. The appar- 
ent lack of degeneracies also suggests that, for most of the 
efficiency parameters, evolutionary trends other than those 
currently assumed are disfavored by the observational con- 
straints we have adopted. For example, a star formation ef- 
fici ency that in c reases with decreasing redshift, as proposed 
bv lWang et all (|2012l ) to match the evolution of the num- 
ber density of dwarfs, is not consistent with the data in the 
context of our model. Indeed, in this model star formation 
efficiency primarily affects the gas-to-star ratios of dwarfs 
rather than their stellar masses or st ar formation rates. As 
in some hydrodynamic al simulations (jOppenheimer fc Pavel 
I2OO8I : iHaas et al.ll2012l ). gas simply accumulates in a dwarf 
until it is sufficient to fuel star formation and ejection 
rates which balance the gas supply through infall. Simi- 
larly, changing the scaling of wind properties alone cannot 
produced th e requ ired evolution, as recently found also by 
iBower et al] \2Q1± . 



4 A NEW MODEL FOR REINCORPORATION 
OF EJECTED GAS 

The Gil version of the Munich semi-analytic model assumed 
the gas ejected from haloes to be stored in an external reser- 
voir and to be reincorporated into the system on a timescale 
which depends on halo mass and redshift. The particular 
model adopted was. 



^rcin 



with 



1 

-i 

7 



Kir 



(4) 



(5) 



where Mojcc is the current amount of gas in the reservoir, 
idyn.h ~ Q.lH{z)~^ is the dynamical time of the halo, Kir 
is its virial velocity and 7 is a free parameter. The Kir 
dependence was introduced by Gil to model the higher 
wind velocities (relative to escape velocity) in lower mass 
systems. The scaling with tdyn.ii imposes a redshift depen- 
dence similar to that of the dark matter growth time of the 
haloes, which is nearly independ ent of ma ss but decreases 
quite strongly at earlier times (e.g. lGuo fc White,200^ '). Our 
MCMC analysis shows, however, that the data require 7 to 
be much smaller (and thus reincorporation to be much less 
effective) at z 2 than at lower redshifts. In order to find 
a description of the reincorporation process that can repro- 



3.5 




Figure 2. High likelihood region for the 5\ and &2 parameters in 
the reincorporation model described by equation l(6]l. The model 
is constrained using the observed stellar mass functions and the 
K- and _B-band luminosity functions at z = 0, 1, 2 and 3 simulta- 
neously. The solid lines are equal likelihood contours containing 
68% and 95% of the marginalised 2D posterior distribution, while 
colours indicate the maximum likelihood value over all MCMC 
parameter samples projected within each pixel. The logarithmic 
colour scale is normalized to the global maximum likelihood value. 



duce the data at all times, we re-write equation ((SJ as: 
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and we re-run our MCMC chains constrained simultaneously 
by the observations at all redshifts and including 5\ and ^2 
as free parameters. 

The results of this procedure are shown in Fig. [2] Af- 
ter marginalising over other parameters the high likelihood 
region is quite localised in the 81^-82 plane. The maximum 
likelihood values of these parameters and their ±1 cr range 
are 5i = 2.40lo!33 and 82 = S.OT^Q Je- other parameters 
remain in their plausible ranges for this maximum likeli- 
hood model. If we adopt 8\ = 2.4 and 82 = "A we find, using 
Mvir cx V},JH(z) and tdyn,ii (X H{z)~^ that 
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(1 + ^) 



Mv 



(7) 



n^(i-F2)3-h(i-n™)' 

where Q.m ~ 0.25 is the present-day matter density. The 
redshift-dependent ratio on the right-hand-side of this equa- 
tion varies by less than a factor of 2 for ^ 2: ^ 6 so we ne- 
glect it and adopt a simpler reincorporation model in which 
the relevant timescale is inversely proportional to halo mass 
and independent of redshift: 



tr 



, 10^° Me 
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Mvir 
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where the constant 7' now has units of time. Using our 
MCMC chains to identify the best-fit of this new model to 
our constraining data at all four redshifts, we find Xt ~ 127 
for 134 degrees of freedom and 11 adjusted parameters. For 
comparison, applying the same procedure to the Gil model 
yields Xt = 475 for the same number of adjusted param- 
eters. Thus our modified assumptions about reaccretion of 
ejected gas substantially improve the ability of the model to 
represent the observational data. 

The best-fit value of the reincorporation time-scale, 
7' = 1.8 X 10^°yr, implies that haloes with Mvir lO""'^ M© 
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Figure 3. Illustration of the major changes in the dependence of feedback on galaxy properties between the Gil model and the model 
of this paper. The left panel shows the disk reheating efficiency edisk as a function of maximum circular velocity Vmax. Often referred to 
as the mass-loading factor, this is the ratio of the star formation rate to the rate at which ISM material is heated and injected into the 
hot halo. The middle panel shows the halo ejection efficiency ehalo as a function of Vmax. This is the fraction of the available SN energy 
which is used in reheating disk gas and in ejecting hot gas from the halo. The right panel shows the reincorporation timescale troinc as 
a function of halo virial velocity Vvir and of redshift. In each panel dashed lines refer to the Gil model and solid lines to our new model 
with its best-fit parameters. The blue shaded regions in the left two panels give the 2cr range allowed by our MCMC sampling. Colours 
in the right panel indicate redshift as shown by the label. 



have very short reincorporation times, hence are able to eject 
very little gas at any relevant redshift. In contrast, very lit- 
tle of the gas ejected from haloes with Mvir ^ 10^'' M© ever 
returns. For intermediate- mass haloes, ejection is most effec- 
tive at high redshift because at given halo mass, cooling, star 
formation and feedback are all typically stronger at earlier 
times. We note that higher halo masses at higher redshift 
(see Fig. llUp result in galaxies of given stellar mass hav- 
ing gas return times which are shorter at early times than 
at z = 0. However, the redshift dependence is now much 
weaker than in Gil, and the reincorporation time-scales in 
low-mass galaxies are longer, ensuring that gas is predomi- 
nantly returned at low redshift. The changes are illustrated 
in the right panel of Fig. [31 at 2: = 3 the reincorporation 
time is longer in the new model than in Gil for all haloes 
with Vvii < 140 km/s, whereas at z — this is only true for 
Kir < 30 km/s. 

Our proposed dependence of the gas return time-scale 
on halo mass can also be compared with that found in re- 
cent direct numerical s i mulat ions of the physics of feedback. 
lOppenheimer fc Pavel i 20081 ) found that the gas ejected in 
winds is affected not only by gravitational forces, but also by 
ram pressure against the circumgalactic medium. As a re- 
sult, material is less likely to escape in denser environments. 
These hydrodynamical simulations suggest that the time for 
ejected material to return to a galajcy scales inversely with 
the total mass of the system rather than with the local dy- 
namical time, a very similar result t o that we find here. 
More specifically, lOppenheimer et al.l (|2010l ) found return 
times to scale with M'^^'^ for constant wind speed and with 
M~i^'^' for wind speeds proportional to the virial velocity of 
the system. The latter is more similar to our feedback imple- 
mentation (see Appendix 13 for details). Indeed, we obtain 
a compatible scaling of the return times in our new reincor- 
poration model. Our preferred value for 7' (1.8 x lO^^yr) re- 
sults in a reincorporation time that varies from 1.8 x 10^"yr 
for haloes with Afvir = 10^° M© to 1.8 x lO^yr for haloes 



with Mvir = 10 ■ Mq. These correspond to the time-scales 
fo r ~10% of the gas to be re incorporated in the simulation 
of lOppenheimer et al.l (I2OIOI '). As expected, our time-scales 
are shorter since they correspond to the time for the gas to 
return from the ejected to the hot phase, whereas in the hy- 
drodynamical simulations they correspond to the time for 
a full cycle, beginning when the material leaves the star- 
forming region and ending when it is reincorporated into 
the interstellar medium of the galaxy. 

In passing we note that our new version of the semi- 
analytic code also includes a number of technical changes 
mostly related to book-keeping. These are designed to as- 
sure accurate mass and metal conservation at all times, and 
improved memory management at run-time. These changes 
can have an impact on a few individual galaxies but they 
leave the global properties of the population unchanged. We 
have also introduced a minor modification to the physics of 
infall. In previous versions of the model, if the Afvir value of 
an FoF halo drops during its evolution, then hot gas (includ- 
ing metals) was removed so that the total baryon content re- 
mains equal to the cosmic baryon fraction times Mvir0 We 
relax this assumption here and allow each FoF halo to retain 
all its associated baryons even if this puts its baryon fraction 
temporarily above the cosmic value. Its baryon content then 
begins to increase again only after its Mvir rises above the 
previous maximum value. This change simplifies the treat- 
ment of chemical enrichment, allowing detailed conservation 
of the mass of every element as material shifts between the 
different baryonic components in each FoF halo. It also has 
very little impact on global galaxy properties. 



The Mvir values of FoF haloes generally increase with time but 
can drop temporarily through numerical fluctuations or if they 
pass close to a larger halo. 
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Table 1. Statistics from tlie MCMC parameter estimation. The best-fit values of parameters and their "2-(t" confidence limits are 
compared with the values published inlGuo et alJ ^2012) for their WMAP7 model constrained by 2 = data alone. 





Guol2 


New Model 


2(T lower limit 


2(7 upper limit 


QSF (SF efficiency) 


0.01 


0.055 


0.027 


0.060 


'sAGN (Radio feedback efficiency) 


7.0 X lO"'^ 


3.2 X lO"'"^ 


2.7 X 10"^ 


4.1 X 10"^ 


/bh (BH growth efficiency) 


0.03 


0.015 


0.011 


0.017 


e (mass-loading efficiency) 


4.0 


2.1 


1.8 


2.6 


Hchoat (mass-loading scale) 


80 


405 


315 


473 


/3i (mass-loading slope) 


3.2 


0.92 


0.82 


1.14 


Tj (SN heating efficiency) 


0.2 


0.65 


0.52 


0.87 


Veject (SN heating scale) 


90. 


336 


263 


430 


^2 (SN heating slope) 


3.2 


0.46 


0.38 


0.59 


7' (ejecta reincorporation) 


not applicable 


1.8 X 10"Vr 


1.8 X 10^°yr 


1.9 X lO^^yr 


•^yicld (metals yield) 


0.03 


0.047 


0.040 


0.051 



4.1 Best-fit parameters 

As noted above, with the new reincorporation model of 
equation (|8} it is possible to find parameters that produce 
acceptable agreement with the observational constraints at 
all the redshifts we consider. The resulting best-fit param- 
eters are listed in Table [T] together with their 95% confi- 
dence ra nges. They are c ompared to the parameters pre- 
ferred bv lGuo et all l|2012l) for a fit of the previous model to 
the 2 = data alone, assuming a WMAP7 cosmology. The 
most substantial differences occur for the star formation ef- 
ficiency, which is considerably higher in the new model, and 
for the supernova feedback parameters, with high reheat- 
ing and ejection efficiencies in the new model extending to 
higher mass systems. These changes combine with the new 
reincorporation scheme to ensure that the number density 
of galaxies below is strongly suppressed at early times, 
but grows through the return of ejected gas at later times 
so that the observed abundance of these galaxies at 2: = 
is still reproduced. This later formation increases the low- 
redshift star formation rates of low-mass galaxies, making 
their colours bluer. In Figure [3] we present a graphical com- 
parison of the changes in the relevant efficiencies between 
Gil and this work. 

In our new model the mass-loading of the winds which 
eject gas from galaxy disks into their hot haloes depends less 
strongly on the rotation velocity of the disks and is larger 
for all but the smallest disks than in the Gil model. (This 
is Edisk in equation IA5I its scaling with Vmax is illustrated 
for the old and new models in the left panel of Figure O) 
The values now required are similar to those inferr ed from 
observational data both at low and at high redshift |Martinl 
1 19991 : IStrickland fc HeckmanllioogI : ISteidel et al]|20m 

As can be seen from the middle panel of Figure [3l the 
values now required for the parameters in equation (|A8|) im- 
ply that all the available energy from supernovae is used to 
reheat and eject gas in typical disk galaxies (i.e. euaio = 1 
for Vmax < 300 km/s). Arguably, this efficiency is unsat- 
isfactorily high since observations of supernova remnants 
suggest that a significant fraction of their kinetic energy is 
thermalised and radiated away. Nevertheless, we note that 



within our 2(j allowed region ey,aio can be as low as 60% for 
galaxies like the Milky Way, and in any case the total energy 
available to drive winds could exceed that we have assumed 
by a factor of two or more. For example, we have considered 
only supernova feedback, neglecting input due to radiation 
and winds from massive stars. 

Finally, as already mentioned briefiy above, the third 
panel of Figure |3] shows that the reincorporation times for 
gas ejected from a halo depend more strongly on halo virial 
velocity K,ir than in Gil, and that at fixed Vvi-r the depen- 
dence on redshift is reversed. Thus at z = reincorpora- 
tion is faster in the new model than in Gil for all galaxies 
with Vvir > 30 km/s, whereas at z = 3 this is only true for 
Vvir > 150 km/s - reincorporation is substantially slower at 
virial velocities typical of dwarf galaxies. 

The increase in star formation efficiency (SFE) relative 
to the model of IGuo et all l|2012l) primarily affects inter- 
mediate mass galaxies since objects of lower masses have 
very short cooling times and star formation rates are driven 
primarily by the infall of new gas and the reincorporation 
of ejecta. At high redshift, the increased SFE compensates 
for the more efficient supernova feedback to maintain the 
abundance of objects. At lower redshift, the effect is 
neutralized by the higher AGN radio mode efficiency in the 
new model, so that the high mass cut-off in the mass func- 
tion is unchanged. The star formation efficiency parameter 
QSF regulates the conversion of cold gas into stars, so it is 
important to note that, despite the new efficiency being ~ 3 
times fiighcr than in Guo ct al. (2011) and ~ 5 times higher 
than in IGuo et al.l (|2012l ) , the predicted cold gas fractions 
of low-redshift spiral and irregular galaxies are quite com- 
parable. The higher conversion rate from gas into stars is 
balanced by the increased amount of gas available due to 
the delayed reincorporation of ejected gas. Finally, the new 
best-fit requires a higher value for the metal yield than in 
Gil. This parameter is primarily constrained by the relative 
abundances of bright galaxies in the K- and B-bands, i.e. 
by the colours of massive elliptical galaxies. 



Simulations of the galaxy population: implications for galactic winds and the fate of their ejecta 9 



"o -3 



^Combined doto sets used In MCMC 




SMF, z = 3 



— This Work 
This Work 

- -G1 1, WMAP7 + err 



11 10 9 

log,o(M/Me) 



9 
log,o(M/Me) 



Figur e 4. Evolution of the stellar mass function from z = 3 to z = 0. Theoretical predictions from our new model and from lGuo et alj 
1120121 ) are shown as solid and dashed red lines, respectively. The dotted red lines in the z ^ 1 panels are for the new model before 
convolution with a log- normal distribution of width 0.25 dex representing random uncertainties in the observed stellar masses. Blue dots 
with error bars show the observational constraints and their 1 — a uncertainties as adopted for the MCMC analysis . These resu l t from 
combining a variety of datasets at each redshift. The individual data points are shown in Fig.[Cl] The 2 = results of lLi fc Whit3 l|2009l ) 
are repeated at all redshifts as a black dotted line. 



5 CONSISTENCY WITH THE INPUT 
CONSTRAINTS 

In this section we compare the theoretical predictions of 
our new model with the observational properties used as 
constraints in our MCMC analysis: the stellar mass func- 
tion, and the K- and B-band luminosity functions from 
2 = 3 to 2 = 0. All magnit udes are AB, the stellar pop- 
ulations are iMarastonI (120051 ) and the WMAP7 cosmology 
is adopted. For all properties we also compare with the pre- 
dictions of the Gil model tuned to the 2 = data alone , 
assuming this same WMAP7 cosn iology iIGuo et al.|[201^ ) 
and re-calculated luminosities for iMarastonI l|2005h stellar 
populations. 



5.1 Stellar mass functions 



IGuo et al] (|201ll ) found good agreement between the stellar 
mass function of their model and the 2 = observations that 
they used to tune its parameters. However, an increasing ex- 
cess of low-mass objects was found when model predictions 
were compared to data at higher redshifts. This excess was 
not a consequence of the high value of erg adopted for the 
original Millennium simulations, since it remained equally 
strong wh en the simu l ations were scaled to a WMAP7 cos- 
mology bv lGuo et ai] l|2012l ). 

The observed number densities of massive and infrared- 
bright galaxies are reproduced equally well in both cosmolo- 



gies, once allowance is made for the substantial random er- 
rors ex pected in the st ellar masses of high- redshift objects 
(i Henriques et al.|[201^ . Within the Gil model it appears 
impossible to retain this agreement, while fixing the abun- 
dance evolution at lower mass. Here we test how well our new 
model for the reincorporation of ejected gas addresses this 
issue. Note that we are attempting to build an a priori phys- 
ical simulation which reproduces the observed abundances 
of galaxies over five orders of magnitude in stellar mass and 
over the last five sixths of the age of the Universe. 

In Fig. U we show the evolution of the stellar mass func- 
tion from 2 = 3 (lower right panel) to 2 = (top left panel). 
Results from Gil and from our new model with its best-fit 
parameters are shown as dashed and solid red lines, respec- 
tively. The dotted red line represents the new model before 
convolution with a log-normal distribution of width 0.25 dex 
representing random uncertainties in the observed stellar 
mass estimates. The shift between dotted and solid red lines 
thus shows the predicted effect of Eddington bias on the 
observed distribution. The models are compared with the 
observational constraints adopted for our MCMC analysis, 
which are derived from a comprehensive set of high-quality 
data. The individual datasets are shown in Appendix [Cl 

The effect of modifying the treatment of gas reincorpo- 
ration time is clearly visible in Fig. |4l The delayed return 
of gas ejected in winds reduces the amount of gas available 
for star formation at early times. The effect is pronounced 
in low-mass galaxies, leading to a reduction in the number 
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Figure 5. Evolution of the rest-frame ii'-band luminosity function from z = 3 to z = 0. Results for our new model and for Gil are shown 
as solid and dashed red lines, respectively. Blue dots with error bars show the observational constraints and their 1 — a uncertainties as 
adopted for the MCMC analysi s. These r esult from combining a variety of datasets at each redshift. The individual datasets are shown 
in Fig. IC2I The 2 = results of ljones et al.i (i2006i ) are repeated at all redshifts as a black dotted line. 



density at given stellar mass by almost a factor of two at 
z ^ 2. The enhanced reincorporation of ejected material at 
later times provides additional fuel, allowing low-mass galax- 
ies to form enough stars to continue to match the observed 
abundances at z — 0. 

The model also matches the moderate but still notice- 
able evolution of the abundance of massive galaxies. This 
agreement contrasts with earlier work where the evolution 
in the observed abundances of high mass galaxies appeared 
much weaker than p redicted by hierarc h ical models (e.g. 
Fontana et"all l2006l : ICaputi et al] l2006l : iMarchesini et all 
The discrepancy can be explained by the combina- 
tion of two effects. For high-redshift galaxies, photometric 
redshifts and stellar mass estimates obtained by SED fit- 
ting have substantial uncertainties which extend the high 
mass tail of the distribution of estimated stellar masses. In 
addition, if masses are derived using population synthesis 
models that omit the near-infrared emission expected from 
intermediate age stars, the fluxes of high-redshift galaxies 
will be interpreted as indicating overly high masses. Once 
these two effects are taken into account, both in correcting 
the data and in tuning the model, quite reasonable agree- 
ment is obtained. 



5.2 The rest-frame if-band luminosity function 

Most previous galaxy formation models have failed to match 
the observed ev olution of the faint end of the K-hand lumi- 
nosity function (|Kitzbichler fc White! [2007l :[ Cirasuolo et al.l 



I2OIOI : iHenriaues et al] I2OIII . I2OI2I : ISomerville etal] boill l. 

This is a direct consequence of the overly early build up 
of the dwarf galaxy population seen in the stellar mass 
function. At the other extreme, model predictions for the 
evolution in abundance of the brightest K-hand galaxies 
have only quite re c ently been reconci l ed with observations 
dBowcr et al."2008'; Tonini et al. 2010'; Fontanot & Monacj 
[2OIO; Henriques et al. 2011 . 2012; Somcrvillc et al. 2011). 

In Figure (5] we compare the predicted evolution of the 
if-band luminosity function out to redshift z — 3 with the 
observational data we used as constraints for our MCMC 
chains. Results from our new model and from Gil are shown 
as solid and dashed red lines, respectively, in both cases 
assuming iMarastonI (|2005l ') stellar populations. The obser- 
vational constraints are based on combining results from 
a number of studies, which are shown individually in Ap- 
pendix [C] Once the new reincorporation prescription is in- 
cluded, our model agrees reasonably well with the obser- 
vational data over the full luminosity and redshift ranges 
considered. The reduced number density of dwarfs at high 
redshift caused by the delayed reincorporation of gas has a 
direct impact on the abundances of faint near-infrared galax- 
ies. This corrects the previous excess of faint objects while 
maintaining agreement at the bright end. 

The most obvious remaining discrepancy is a residual 
overabundance of faint galaxies at z = 2. Within our present 
model this cannot be corrected without worsening the good 
fit to the stellar mass and _B-band luminosity functions of the 
corresponding z — 2 galaxies and their z = 3 progenitors. 
It is plausible that the problem stems from residual system- 
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atic errors in one or more of these observational datasets, 
but it could also reflect a problem in the stellar population 
modelling of 1 Gyr old populations. 

Despite the considerable uncertainties that still remain 
in theoretical predictions, it is interesting to note that the 
evolution of the bright tail of the near-infrared luminosity 
function of galaxies can be matched by a model that simulta- 
neously reproduces the observed evolution of the high-mass 
tail of the stellar mass function. Until recently, stellar pop- 
ulation synthesis models assumed near-infrared emission to 
come predominantly from old populations. If this were true, 
the weak evolution in the characteristic luminosity L^. at 
rest-frame K from 2; = to z = 3, would imply substan- 
tially less evolution of the characteristic mass of the stellar 
mass function over this same redshift range than is expected 
in hierarchical models of the kind we study here. If, how- 
ever, intermediate age population s also emit significantly in 
the near-infrared j|Marastonll2005l : ICharlot fc Bruzual|[2007l : 
IConrov et al]|2009i r then the observed luminosities of the 
high-redshift galaxies correspond to significantly lower stel- 
lar masses, and the observed evolution of the stellar mass 
and near-infrared luminosity functions can be matched si- 
multaneo usly. (This i s alre ady apparent from the com bined 
results of iGuo et al] I2OIII and iHenrigu^ et al. 2012.) Ac- 
cept able results are obta i ned f or both thejMaraston (2005) 
and ICharlot fc Bruzual ||2007|) stellar popul ation models. 



and lunarlot &: pruzuail llzUU< | ) stellar popul ation models, 
but using the older iBruzual fc CharlotI j2003D models leads 
to a significant underprediction of L* in the ii'-band at 
higher redshift. 



5.3 The rest-frame B-band luminosity function 

Much of the light emitted in the B-band is produced by 
young stars in star-forming regions. In addition, popula- 
tions become bluer with decreasing metallicity and so emit 
proportionately more in the B-band. Finally, star-forming 
galaxies also contain dust which absorbs strongly at B. 
Thus, matching observations of the B-band luminosity func- 
tion and its evolution requires detailed modelling of star for- 
mation, chemical enrichment and dust production. In our 
semi-analytic model, star formation happens both quies- 
cently and in merger-induced bursts, and dust is modelled 
separately for the general interstellar medium and for the 
birth clouds of young stars. Dust extinction is taken to be 

J iroportional to the column density of cold gas in both cases 
De Lucia fc BlaizotllioOTi ). Metals are returned to the cold 
gas phase after each star formation episode and later fol- 
low the gas as it is transferred between the various b a ryonic 
phases (|De Lucia et al.l I2OO i). iKitzbichler fc White! (|2007| ) 
found that although an earlier version of the Munich model 
agreed well with the observed B-band luminosity function 
at 2 = 0, it significantly underpredicted B-band luminos- 
ity functions at high redshift. They corrected this in an ad 
hoc way by lowering the dust-to-cold gas ratios adopted at 
early times. The evolution of this r atio with time was slightly 
modified bv IGuo fc White! (|2009l ) and included in the Gil 
version of the Munich model, leadi ng to similar results fo r 
high-redshift luminosity functions (|Henrioues et al.ll2012l ). 
and it is also included here. 

In Figure (6] we compare predictions for B-band lumi- 
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nosity functions from z — <i to z — Q with the observa- 
tional constraints used in our MCMC study. The individual 
datasets combined to obtain these constraints are shown in 
Appendix [C] Results from our new model with its best-fit 
parameters and from the original Gil model are shown as 
solid and dashed red lines, respectively. At z ~ 0, the model 
is compared to the 6j-band luminosity funct ion of the 2dF 
surve y assummg bj = B - 0.267(B - V) (" Norberg et al.l 
I2OO2I ). at higher redshift the luminosity functions are for 
Johnson B. The delayed formation of low-mass objects in 
our new model reduces the amplitude of the high-redshift 
luminosity functions at fainter magnitudes, but the large 
observational uncertainties mean that both models are con- 
sistent with observation up to 2; = 2. At « = 3, both mod- 
els underpredict the observed abundance at all magnitudes, 
especially the brightest ones. Errors in the treatments of 
star formation, enrichment and dust obscuration, as well as 
uncertainties in the stellar population models, may all con- 
tribute to this discrepancy. 



6 MODEL PREDICTIONS 

In this section we show model predictions for galaxy proper- 
ties other than those used as constraints in our MCMC sam- 
pling. These include rest-frame colours, star formation rates, 
ages, stellar metallicities, clustering, and the relation be- 
tween halo mass and stellar mass. We focus on low-redshift 
data, except that we analyse the evolution of the halo-mass- 
stellar-mass relation from 2: = 3 to z = 0. In most cases, we 
compare the predictions with data from the Sloan Digital 
Sky Survey (SDSS) because the very large volume of the 
survey results in small statistical and cosmic variance un- 
certainties. Our goal is to investigate whether our improved 
representation of the evolution of stellar mass and luminos- 
ity functions affects any of the other discrepancies between 
the Gil model and observed galaxy populations. We find 
that at 2 = low-mass galaxies in our new model have 
higher star formation rates, younger stellar populations, are 
more often blue and are less clustered on small scales than in 
the Gil model. In almost all cases this improves the agree- 
ment with the observed SDSS galaxy population. Since these 
properties were not used as constraints when setting param- 
eters, this should be viewed as a significant success for the 
new model. 



6.1 Rest-frame colours 

Galaxy colours at optical wavelengths are affected by a large 
number of physical factors, making them difficult to model. 
In particular, current specific star formation rates, star for- 
mation and chemical enrichment histories, the later stages of 
stellar evolution, and the amount and distribution of dust 
can all have substantial effects on optical colours. Despite 
this complexity, the observed population of low-redshift 
galaxies separates fairly cleanly into a "red sequence" of rel- 
atively massive galaxie s and a "bl ue cloud" of predominantly 
low-mass dwarfs ( Kau ffmann et al.,2003 : .Baldry et a l. 2004; 
IWeinmann et all booe! )" In the Gil model and its prede- 
cessors, the majority of dwarf galax;ies are much redder 
than observed, w hile massive galaxies are not red enough 
(|Guo et al.ll201lh . Clearly, star formation is truncated too 



early in the majority of model dwarfs, an effect which is lit- 
tle influenc ed by switching from a WMAPl to a WMAP7 
cosmology (jCuo et all 12012 ). The ma ssive galaxies do form 
most of their stars at high redshift fe.g. lDe Lucia et al.ll200d ) 
so their overly blue colours most likely reflect a metallicity 
which is too low. 

Figure [7] shows the distributions of u — i colour for 
eight stellar mass ranges from 10* M© to 10^^ Mq at z = 0. 
The new model with its best-flt parameters (solid red line) 
is compared with the Gil model (dashed red line). These 
histograms use data from the Millennium-II simulation for 
10** M0 < M* < 10^-^ M0 and from the M illennium s i mula- 
tion for 10^'^ Mq < M* < 10^^ Mq. As in lGuo et al.l (|201ll . 
1201 j) they are compared with data from SDSS/DR7, in- 
cluding 1/Vmax corrections so that the thin black histograms 
correspond to volume-limited statistics. All histograms are 
normalized to have unit integral. There is a clear distinction 
in the observed distribution between dwarfs, which are al- 
most all blue, and massive galaxies, which are almost all red. 
Only around the transition at Af, ~ 10^"" M© do both pop- 
ulations appear to exist in comparable numbers. Our new 
best-flt model reproduces these trends signiflcantly better 
than the Gil model. The delayed reincorporation of gas and 
hence the increased fuel supply at late times results in the 
majority of low-mass galaxies continuing to form stars until 
2 = 0. Despite this, the number of passive dwarf galaxies in 
the model still substantially exceeds that observed. 

This excess of passive dwarfs reflects, at least in part, 
the fact that we impose a threshold for star formation (see 
equations lAll IA2I and IA3[) since a substantial fraction of 
the red low- mass galaxies have cold gas masses c lose t o 
merit. These equations date back to Croton et al. l[2006h . 
who based them on results from iKennicutd ( 1998 ^ and 
iKauffmannI l|l996l l. and they clearly need updating to take 
account of recent improvements in our observational under- 
standing of the regula rities under lying large-scale star for- 
mation in galaxies (e.g. iBigiel et~al.,.2008il. Firs t steps in this 
direction have been made bv lFu et all (|2012l ) who include 
a prescription for atomic to molecular gas conversion, and 
assume star formation to be proportional to the molecular 
content. The observed blue population of low-mass galax- 
ies extends to signiflcantly bluer colours than predicted by 
the model. This appears to reflect the fact that many of the 
observed systems have higher speciflc star-formation rates 
than any star- forming dwarfs in the model (see Fig. [9]). 

For massive galaxies, the redder colours we obtain with 
our new model are closer to those observed than are the 
colours from Gil. As discussed in the next section, this is a 
consequence of the higher metal yield in our best-flt param- 
eter set. This signiflcantly increases the metal abundance of 
these objects. In general, our theoretical distributions are 
much narrower than the observed distributions. This may 
reflect overly narrow metallicity distributions in the model, 
dust effects or observational errors in the observations. 



6.2 Stellar metallicities 

All chemical elements heavier than lithium are produced and 
deposited in the gas phase during the later stages of stellar 
evolution. The amount of metals in future generations of 
stars is then strongly influenced by mixing processes and by 
exchanges of material between the cold interstellar medium. 
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Figure 7. Histograms oiu — i colour for ^ = galaxies in eight disjoint stellar mass ranges. Results for our new best-fit model (solid red 
lines) are compared with results from Gil (dashed red lines) and with low-redshift data from SDSS/DR7 (solid black lines). The models 
are based on the Millennium-II simulation for stellar masses below 10^'^ Mq and on the Millennium simulation for higher masses. 
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Figure 8. The stellar mass-stellar metallicity relation at 2 = 0. 
Our new model with its best-fit parameters (solid red lines) is 
compared with Gil (dashed red lines) and with SDSS data from 
iGallazzi etahl l l200d) (blue squares and lines). In all cases, the 
central lines represent the median metallicity in each mass bin 
(blue squares for the observations), while the upper and lower 
lines represent the 16th and 84th percentiles of the distribution. 



the hot gas atmospheres of galaxies, groups and clusters, 
and the circumgalactic medium. It follows that changes in 
modelling of the physics of gas ejection and reincorporation 
will be reflected in the stellar mass-metallicity relation. In 
Fig. |S] we show this relation at 2 = for our new model 
(solid red lines) for the model of G il (dashed red lines) 
and as estimated from SDSS data by IGallazzi et all (|2005l . 
blue squares and lines). For each sample, the median and 
the 16% and 84% pe rcentiles are shown a s a function of 
stellar mass. Note that IGallazzi et al] (|2005l ) estimate the 1- 
a uncertainties in their SDSS metallicity estimates to range 
from 0.6 dex at 10^ Mq to 0.2 dex above lO" Mq. This can 
account for much but not all of the difference in scatter 
between the models and the observations. The optical to 
near-infrared colours of galaxies are strongly influenced by 
their metal content, so the constraints we applied on stellar 
mass and luminosity functions are reflected in the model 
mass-metallicity relation. As a result, despite the differences 
in the feedback physics, the predicted distribution shapes 
are very similar in the two models. Both follow the observed 
trend approximately, although with a signiflcantly shallower 
slope. 

A noticeable difference between the two models is the 
higher overall metallicity in the new best-fit. This produces 
massive galaxies with abundances closer to those observed, 
but also somewhat overpredicts the metallicities of dwarfs. 
This is a consequence of the higher yield preferred for the 
new model by our MCMC chains (0.047 as opposed to 0.03 
in Gil). The higher metallicity is responsible for the redder 
colours of massive objects, as shown in the previous section, 
for the increase in the number density of bright 7^-band 
galaxies, and for a reduction in the abundance of bright 
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Figure 9. The distribution of specific star formation rate (left 
panel) and r-band weighted age (right panel) for galaxies with 
IO^Mq < Kh < 10^-5 Mq. Predictions for our new model with 
its best-fit parameters are shown as solid red lines, while results 
from Gil are shown as dashed red lines (both are based on the 
Millennium-II simulation). The models are compared with SDSS 
data; specific star formatio n rate estimates come from Salim ct al. 
(|2007|1 and stellar ages from lGallazzi et al.l ||2005|'| . Both are shown 
as solid black lines. 



S-band objects at z = (see Sections 15.21 and 15.31 respec- 
tively). 

If the uncertainty estimates of lGallazzi et al.l (|2005| ) are 
accurate, the intrinsic scatter in the observed relation is con- 
siderably larger than in the model. This might explain, at 
least in part, why the red-sequence and blue-cloud popu- 
lations show less scatter in colour in the models than in 
the real data (Fig. [7]). The steepness of the observed me- 
dian relation is also puzzling, particularly since the models 
seem to agree with the observed relation between gas-phase 
metallicity and stellar mass (see Gil - the current model 
produces similar results). The model slopes in Fig. |8] are 
quite simila r to those found by other pu blished semi-analytic 
models (e.g lDe Lucia fc Borganillioia ). At the massive end, 
a closer fit to observation might be obtained by more ef- 
fective tidal disruption of satellites ijHenriaues et al.l boosi : 
iHenriaues fc ThomasI [20T0I ) : the high metallicities of stars 
formed in situ are then less diluted by stars accreted through 
minor mergers because these are instead assigned to the intr- 
acluster light. We note, however, that the Gil model already 
includes a substantial intracluster light component in rich 
clusters. For stellar masses below ~ 10^° M©, strong feed- 
back from super novae is crucial in order to keep the metal 
abundances low llBertone et al.l [ I2OO7I : IHenriaues fc ThomasI 
l2010l : lDe Lucia fc Borganill2012l ). 



6.3 Ages and specific star formation rates in 
dwarf galaxies 

In a recent paper, IWeinmann et all (|2012l ) analyzed the 
overly early build-up of dwarf galaxies in semi-analytic mod- 
els and in SPH simulations by studying their ages and 
star formation rates at 2 = 0. As expected, the prema- 
ture formation of dwarfs leads to older ages and lower star 
formation rates than observed in the local Universe, with 
similar discrepancies seen i n the Gil model and in th e 
hydrodynamic simulatio ns of lOppenheimer fc Pavel (|2008l ). 
IWeinmann et al.l (|2012l ) focused on central galaxies, since 
including satellites would require treatment of additional 
physical processes and so introduce extra complications 



into the analysis. Nevertheless, they do note that the dis- 
agreement between star formation properties in the mod- 
els and those observed is considerably worse for satel- 
lites t han for centrals, as already shown clearly by earlier 
work IWeinmann et al.| [2009: Wang ct al. 2qi3). Using their 
own semi-analytic model, .Bower et al., (,20121 ) found simi- 
lar trends when looking at the number density of low-mass 
galaxies at high redshift and their star formation rates at 
z = Q. Our new model correctly predicts the number den- 
sity of low-mass galaxies as a function of cosmic time, so it 
is interesting to see if these discrepancies persist. In this sec- 
tion we compare model predictions with data for all types 
of galaxies, including satellites. 

In Fig. [9l we compare the distributions of specific star 
formation rate (left panel) and r-band luminosity-weighted 
age (right panel) for low-mass galaxies (10^ M0 < Mi, < 
10^'^ Mq) with observations at 2; = 0. Our new model 
with its best-fit parameters is represented by the solid 
red lines while results from Gil are shown as dashed red 
lines. SDSS data from the MPA-JHU catalogU(|f] are plot- 
ted as solid black lines after applying a 1/T^ax correction 
to convert them to volume-limited statistics. The star for- 
matio n rate data are taken from SDSS-DR7 (|Salim et al.l 
l2007h while the luminosit y-weighted ages are from SDSS- 
DR 4 (iGallazzi et al.ll2005l). Stellar mass estimates follow- 



ing iKauffmann et aLM 20031 ) are used in both cases. The 
longer reincorporation time-scales for ejected gas in our new 
model delay the growth of low-mass galaxies and shift star- 
formation activity to significantly lower redshifts. As can be 
seen in the left panel of Fig. |5]this increases specific star for- 
mation rates at z = by almost a factor of three and brings 
them into relatively good agreement with those observed. 

This later star formation results in substantially 
younger luminosity-weighted ages. In the right panel of 
Fig. m the mode of the distribution for low-mass galaxies 
drops by almost a factor of two to ~ 2.5 Gyr. This is in 
considerably better agreement with SDSS data, although 
a substantial further reduction still seems to be required. 
Note also the long tail to "old" ages in the model histograms 
which is almost absent in the observations. This corresponds 
to the red (primarily) satellite population seen in the rele- 
vant panel of Fig. [T] We note that the age discrepancy for 
the blue population between th e Gil model and the S DSS 
data appears larger here than in lWeinmann et"aLl (|2012l ) be- 
cause that paper compared l/-band weighted ages from the 
models with r-band weighted ages from SDSS. 

6.4 Stellar-mass — halo-mass relation 

The amount of cold gas, and hence of fuel for star formation, 
in a galaxy depends directly on the mass of its halo. At early 
times the ratio of baryons to dark matter is almost uniform 
on the scales which build galaxies, and although later evolu- 
tion modifies this proportionality considerably, distributing 
the baryons differently between the stars and the various 
gas components, it does not erase the strong trend for more 
massive haloes to have more massive central galaxies. For 
this reason, a monotonic relation between the stellar mass 
of a central galaxy and the mass of its halo is expected to 

^ http:/ /www. mpa-garching.mpg.de/SDSS/ 
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Figure 10. Stellar mass as a function of maximum past halo mass 
for the galaxy populations present at redshifts 3, 2, 1 and 0. Filled 
circles show median values at each halo mass for our new model 
with its best-fit parameters, with error bars indicating the ztlcr 
scatter. For comparison, solid lines show the relations obtained at 
these same re dshifts (coded by col or) using abundance matching 
techniques by iMoster et al. ] ||2012| '). The dashed blue line is the 
median relation at 2 = 3 for the Gil model. 



be a good first approximation to the outcome of the galaxy 
formation process. Since the halo of a satellite galaxy can 
be strongly truncated by tidal effects without a correspond- 
ingly large reduction in the mass of its central galaxy, the 
stellar mass of a satellite is expected to be more tightly re- 
lated to the maximum past m ass of its (sub)halo th an to its 
current subhalo mass (see e.g. iReddick et al.ll2012l ). 

Under the assumption of a monotonic relation between 
central galaxy stellar mass and maximum past halo mass, 
this SM-HM relation can be obtained explicitly by com- 
paring the observed abundance of galaxies as a function of 
stellar mass with the simulated abundance of subhaloes as 
a function of past maximum mass (see Fronk et al.. 1988i: 
Vale fc Qstriked l2004l: ICoiirov et al] bOodTlBehroozi et al 



20101 : iMoster et aU bOlOl : IGuo et all l2010l : lReddick et al 
20121 . for various forms of this abundance matching argu- 



ment). This assumes, of course, that the real Universe con- 
forms to the ACDM model, and, in addition, that all real 
galaxies correspond to a subhalo which survives until the 
relevant redshift in the dark matter simulation being used. 
This latter assumption is seriously violated in the galaxy 
formation models of Gil and this paper (see, for example. 
Fig. 14 in IGuo et aLllioTH ) although the violation does not 
dramatically affect the SM-HM relation. 

Although the (sub)halo abundance matching method 
appears very successful at producing mock catalogues which 
match observation, it has serious drawbacks. In the first 
place, since the observed abundance of galaxies is used as in- 
put to the scheme, it cannot be used to constrain the physics 
of galaxy formation. This has to be done indirectly using 



the derived SM-HM relation, the clustering of galaxies and 
their evolution. Secondly, the stellar masses of galaxies un- 
doubtedly do depend on aspects of their assembly history 
other than the maximum past mass of their haloes, for exam- 
ple, whether they are currently central or satellite galaxies, 
whether they recently experienced a major merger, and so 
on. These additional dependences may bias mock catalogues 
constructed by abundance matching, and thus the cosmolog- 
ical parameter values inferred by comparing their large-scale 
structure to that observed. Such biases can only be identified 
through models which explicitly follow galaxy formation. 
Here we compare the SM- HM relation p r edicte d by our new 
model to that inferred bv IMoster et al.l (|2012l ) who applied 
an abundance matching technique self-consistently across 
multiple redshifts using the Millennium and Millennium-H 
simulations. Note that they corrected for "missing" sub- 
haloes using the Gil model, rather than matching to the 

subhaloes detected in the simulatio ns alone. 

Fig. [To] compares results from IMoster et al.l l|2012t ) for 
redshifts 0, 1, 2 and 3 (solid lines) to the SM-HM relation in 
our new model (filled circles give the median central stellar 
mass at each halo mass, while "error bars" indicate the ±lcr 
scatter). Different colours encode redshift as indicated by 
the label. Model predictions are based on the Millennium 
Simulation for Mvir > lO'^^ M© and on the Millennium-II 
Simulation for smaller masses. For comparison, we show the 
relation for the Gil model at 2; = 3 as a dotted blue line (at 
2 = the Gil prediction is very close to that of our new 
model). For the new model, th e agreement with t he SM- 
HM relation inferred directly bv lMoster et al.l l|2012l ) is very 
good. This is, of course, expected since the new model is a 
good fit to the high-redshift stellar mass functions we used 
as constraints, and these are very similar to the observa- 
tional results used as input bv lMoster et al.l l|2012l ). Notice 
that the characteristic mass at which the SM-HM relation 
changes slope now shifts to higher mass with increasing red- 
shift, whereas it remained approximately constant in the 
Gil model. As a result, stellar masses are substantially lower 
in the new model at 2 = 3 for haloes of mass Mvir < 10'^ Mq 
than in the Gil model, as required to reduce the abundance 
of intermediate- and low-mass galaxies to the observed level. 
It is interesting that the 2 = 3 relation lies below the 2 = 
relation almost everywhere, showing that galaxy formation 
was actually less efficient at given halo mass at 2 = 3 than 
today, despite the 64 times higher density within haloes at 
that time. 



6.5 Galaxy clustering 

The spatial distribution of galaxies provides many impor- 
tant tests for galaxy formation models. Clustering measure- 
ments hold information about the cosmological parameters, 
the statistics of the initial conditions, the nature of dark 
matter, the way in which galaxies grow within dark matter 
haloes, the way in which they are modified by environmental 
influences, and, on small scales, the rates of galaxy collisions 
and mergers. The latter are of particular relevance for semi- 
analytic models, since the limited resolution and the absence 
of baryons in dark matter simulations cause subhaloes to be 
destroyed by tides well before their central galaxies should 
themselves be disrupted or merge into the central galaxy of 
their parent halo. 
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Figure 11. Projected 2-point autocorrelation functions of galaxies in six disjoint stellar mass ranges. Results from our new model with its 
best-fit parameters (solid red lines) are compared with results from Gil (dashed red lines) and with observational data from SDSS/DR7 
(solid black lines). The Millennium-II Simulation is used for stellar masses below 10^ '^'^ Mq and the Millennium Simulation for higher 



The Gil model ties the position of each such orphan 
galaxy to that of the dark matter particle which was most 
bound within its subhalo at the last time this could be iden- 
tified. At later times, the orphan's position is a weighted 
average of those of the particle and of the central galaxy 
of the main halo, coinciding with the particle at subhalo 
disruption and with the central galaxy a dynamical fric- 
tion time later, at which point orphan and central galaxy 
merge. This scheme produces better agreement with ob- 
served small-scale galaxy clustering than previous cruder 
models. Gil found it to reproduce the autocorrelations of 
galaxies with logM*/M0 > 10.8 on all scales larger than 
30 kpc. However for smaller stellar masses they found ex- 
cess clustering on scales below 1 Mpc. This discrepancy was 
significantly smaller for a WMAP7 cosmology than for the 
original WMAPl cosmology, but remained noticeable. 

In Figure [TT] we show projected autocorrelation func- 
tions for galaxies in several disjoint stellar mass ranges. 
Results for our new model with its best-fit parameters are 
shown as solid red lines while results from Gil are shown 
as dashed red lines. For both models, results from the 
Millennium-Il Simulation are plotted for log M*/ Mq < 9.77 
and from the Millennium Simulation for higher masses. 
Model result s are compared with the same SDSS /DR7 
data used in IGuo et all l|201ll ) and IGuo et al] ||2012D . The 
new model now accurately reproduces the observations 



for masses above 10^"'^^ Mq and for projected separations 
larger than 20 kpc. The later assembly of stellar mass 
caused by our modified reincorporation model reduces the 
masses of satellite galaxies relative to those of centrals, be- 
cause the satellites lose their ejecta reservoirs before the gas 
can be reaccreted. This weakens the "l-halo" contributions 
to the autocorrelation functions and eliminates the small- 
scale clustering excess for 8.77 < logM^/M© < 9.27 and 
10.27 < logM*/M0 < 10.77. In the two intermediate mass 
bins, the reduction is actually too strong and the new model 
falls significantly below the observations. These results show 
that small-scale clustering is quite sensitive to galaxy for- 
mation physics, and as a result it is desirable to include 
clustering data among the constraints when using MCMC 
sampling to explore galaxy formation issues. 



7 CONCLUSIONS 

Semi-analytic galaxy formation models are designed to ex- 
plore how the many astrophysical processes which shape 
the formation and evolution of galaxies are refiected in the 
systematic properties of the observed galaxy population. 
In this paper we have extended the Monte Carlo Markov 
Chain (MCMC) metho dolo gy introduced in this context by 
iHenriques eTal] l|2009l ) and (Henriques fc Thom3 (|2010l ) to 
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allow population properties at a wide range of redshifts to be 
used simultaneously as constraints. This exploits one of the 
strong points of the semi-analytic programme, the fact that 
its predictions at all redshifts are, by construction, consistent 
both with the specific physical assumptions of the model and 
with the growth of structure in a ACDM universe. 

As observational input, we decided to use stellar mass 
functions and rest-frame B- and if -band luminosity func- 
tions at 2 = 0, 1, 2 and 3. We combined recent high qual- 
ity observational determinations to obtain a uniform set 
of constraints with plausible and consistent estimates of 
their uncertaintie s. We adopted the modelling framework of 
iGuo et all l|201ll ) and used our optimised MCMC scheme 
to explore allowed values for 11 of its parameters, those 
governing the physics of star formation, black hole growth, 
feedback-driven gas flows and chemical enrichment. 

When constrained using data at z = alone, the 
MCMC modelling gave best-fit parameters close to those 
preferred by Cll. Similarly good fits were obtained when 
the data at any other single redshift were used as con- 
straints. However, no single parameter set could provide 
a good fit at all four redshifts. The allowed regions at 
different epochs overlap for most parameters, but the 
timescale for reincorporation of ejected gas is required 
to vary differently with redshift than in the Gil model. 
This explains why this model, like most other physi- 
cally based galaxy population models (see, for example, 
Fontanot eral]l2009l: ICirasuolo et"ai]|2010l:[Sonierville et al ' 



201ll : iHenrigues et al.' '20121: ICuo et al.l |2012| : iBower et al 



20121 : IWe inmann ct al. 2012) overpredicts the abundance of 



lower mass galaxies at high redshift once parameters are set 
to fit the present-day galaxy population. 

In order to identify the modifications required to match 
galaxy properties at all epochs, we introduced arbitrary 
power-law scalings of ejecta return time with redshift and 
with halo virial velocity, and we reran our MCMC sampling 
constrained by the abundance data at all four redshifts to- 
gether. This demonstrated that the extended model could 
indeed fit the observed stellar mass and luminosity functions 
at all four redshifts simultaneously. The ejecta return time 
was required to scale inversely with halo virial mass and to 
be independent of redshi ft. This simple result is quite sim- 
ilar to the scaling which Oppenheimer fc Davj l|2008l 'l and 
lOppenheimer et al.l (|201oF found in their cosmological hy- 
drodynamic simulations, where the chosen feedback recipes 
gave reasonable fits to the observed properties of circum- 
galactic gas. 

A comparison of our new best-fit model with the pre- 
ferred model of Gil showed quite substantial changes to the 
efficiency and scaling of a number of the star formation and 
feedback processes, despite the fact that the two models pro- 
duce very similar galaxy populations at z = 0. In the new 
model, the efficiency with which the gas disk makes stars 
has increased substantially (by more than a factor of five), 
supernova energy is used at maximum efficiency to eject gas 
in all but the most massive galaxies, AGN feedback is more 
efficient (by more than a factor of three), the mass-loading 
factor for disk winds depends much more moderately on 
galaxy rotation velocity, and ejecta return times scale more 
moderately with halo virial velocity Kir but oppositely with 
redshift (at fixed V^ir) than in the Gil model. Combined the 
modifications ensure a suppression of the formation of low- 



mass galaxies at early times and a more efficient build-up at 
z < 2, while maintaining the previous evolution of the num- 
ber density of the most massive objects. An increased yield 
leads to better reproduction of the metallicities of massive 
galaxies. 

The fact that models with such substantial differences 
produce similar present-day populations is a dramatic illus- 
tration of the interconnectedness of the many processes in- 
fiuencing galaxy formation. Nevertheless, our MCMC anal- 
ysis demonstrates that there are no significant degeneracies 
when each model is fit to the observational data it was de- 
signed to interpret. All the parameters are required for an 
adequate fit. The physical interest of the fits lies in the spe- 
cific scalings and efficiencies implied, in particular, in check- 
ing whether the efficiencies lie within their plausible ranges. 
In this context it may be worrying that our new model re- 
quires supernova energy to be used so efficiently to eject gas 
from galaxy disks. 

In our new model ejected gas returns to lower mass 
haloes much less rapidly at high redshift than in the Gil 
model. As a result, low mass galaxies grow more slowly, and 
their abundance as a function of stellar mass and luminos- 
ity is a better fit to the observations. At z < 1, however, 
ejecta return to all but the smallest haloes more rapidly 
than in the earlier model, leading to stronger late-time star 
formation and similar z — stellar masses. By construc- 
tion, these chang es eliminate one of the main problems which 
IGuo et al] (|2012l ) identified in their model: its overly early 
production of lower mass galaxies. It is interesting that the 
other two problems mentioned by the authors are also re- 
duced, even though the relevant observations were not used 
as constraints in our MCMC modelling. As a result, in the 
new model: 

• The observed galaxy abundances are correctly repre- 
sented over 5 order of magnitude in mass (7 < log M*/ Mq < 
12) and throughout most of the age of the Universe. 

• The fraction of dwarf galaxies (logAf^/M© < 9.5) 
which are blue, young and actively star-f orming at z = i s 
substantially higher than in the model of lCuo et al.l l|2012l ). 

• The clustering of dwarf galaxies is reduced as result of 
the increased central to satellite ratio at fixed stellar mass, 
and is no longer systematically above that observed. 

Nevertheless, at the lowest masses, star-forming galaxies are 
still older and the passive fraction is still significantly larger 
than observed. This problem may be related to the star- 
formation threshold adopted in the model (see Appendix 
Al), since many of the dwarfs, while containing considerable 
gas, are sitting very close to the threshold. 

The increased heavy element yield in the new model 
produces higher metallicities and redder colours for mas- 
sive galaxies, improving the fit to observations. On the 
other hand, the slope of the stellar mass-metallicity re- 
lation remains shallower than observed, so the stellar 
metallicities of low-mass ga laxies are now overpredicted. 
iHenriques fc ThomasI (|201G| ) noted that the slope of this 
relation can be increased by allowing more efficient tidal 
disruption of satellites. The stellar populations of massive 
galaxies are then less diluted by accretion of low-metallicity 
material from merging satellites. It remains to be seen 
whether this can improve the chemical properties of the 
model without destroying its ability to fit the luminosi- 
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ties of massive galaxies which currently grow substantially 
more through mer ging than through star formation (see 
IGuo fc Whit^l200a i. A more complete treatment of chemi- 
cal evolution, allowing detailed comparison with the obser- 
vational information now available on stellar and gas-phase 
abundances and abundance ratios, will undoubtedly be a 
fruitful extension of the kind of galaxy population modelling 
presented in this paper. 



ACKNOWLEDGEMENTS 

The computations used in this paper were performed on 
the Cosmology Machine supercomputer at the Institute for 
Computational Cosmology, which is part of the DiRAC Fa- 
cility jointly funded by STFC, the Large Facilities Capi- 
tal Fund of BIS, and Durham University. The work of BH, 
SW, RA and GL was supported by Advanced Grant 246797 
"GALFORMOD" from the European Research Council. PT 
was supported by the Science and Technology Facilities 
Council [grant number ST/1000976/1]. GQ acknowledges 
support from the National basic research program of China 
(973 program under grant No. 2009CB24901), the Young 
Researcher Grant of National Astronomical Observatories, 
CAS, the NSFC grants program (No. 11143005), as weU as 
the Partner Group program of the Max Planck Society. VS 
acknowledges support by the Deutsche Forschungsgemein- 
schaft through Transregio 33, "The Dark Universe" . The au- 
thors thank Michele Cirasuolo, Danilo Marchesini and Rob 
Yates for providing their observational data. 



REFERENCES 

Angulo R. E., White S. D. M., 2010, MNRAS, 405, 143 
Baldry I. K., Glazebrook K., Brinkmann J., Ivezic Z., Lup- 

ton R. H., Nichol R. C, Szalay A. S., 2004, ApJ, 600, 

681 

Baldry I. K., Glazebrook K., Driver S. P., 2008, MNRAS, 
388, 945 

Behroozi P. S., Conroy C, Wechsler R. H., 2010, ApJ, 717, 
379 

Bell E. F., Mcintosh D. H., Katz N., Weinberg M. D., 2003, 

ApJ Supp., 149, 289 
Benson A. J., Bower R. G., Frenk C. S., Lacey C. G., Baugh 

C. M., Cole S., 2003, ApJ, 599, 38 
Bertone S., De Lucia G., Thomas P. A., 2007, MNRAS, 

379, 1143 

Bigiel F., Leroy A., Walter F., Brinks E., de Blok W. J. G., 
Madore B., Thornley M. D., 2008, AJ, 136, 2846 

Bower R. G., Benson A. J., Grain R. A., 2012, MNRAS, 
422, 2816 

Bower R. G., McCarthy I. G., Benson A. J., 2008, MNRAS, 
390, 1399 

Bower R. G., Vernon I., Goldstein M., Benson A. J., Lacey 
C. G., Baugh C. M., Cole S., Frenk C. S., 2010, MNRAS, 
407, 2017 

Boylan-Kolchin M., Springel V., White S. D. M., Jenkins 

A., Lemson G., 2009, MNRAS, 398, 1150 
Bruzual G., Chariot S., 2003, MNRAS, 344, 1000 
Caputi K. I., McLure R. J., Dunlop J. S., Cirasuolo M., 

Schael A. M., 2006, MNRAS, 366, 609 



Chabrier G., 2003, PASP, 115, 763 

Chariot S., Bruzual G., 2007, in preparation, 1 

Cirasuolo M., McLure R. J., Dunlop J. S., Almaini O., 

Foucaud S., Simpson C, 2010, MNRAS, 401, 1166 
Cole S., 1991, ApJ, 367, 45 

Cole S., Aragon-Salamanca A., Frenk C. S., Navarro J. F., 

Zepf S. E., 1994, MNRAS, 271, 781 
Cole S., Norberg P., Baugh C. M., et al., 2001, MNRAS, 

326, 255 

Conroy C, Gunn J. E., White M., 2009, ApJ, 699, 486 
Conroy C, Wechsler R. H., Kravtsov A. V., 2006, ApJ, 
647, 201 

Croton D. J., Springel V., White S. D. M., De Lucia G., 
Frenk C. S., Gao L., Jenkins A., Kauffmann G., Navarro 
J. F., Yoshida N., 2006, MNRAS, 365, 11 

De Lucia G., Blaizot J., 2007, MNRAS, 375, 2 

De Lucia G., Borgani S., 2012, ArXiv e-prints 

De Lucia G., Kauffmann G., White S. D. M., 2004, MN- 
RAS, 349, 1101 

De Lucia G., Springel V., White S. D. M., Croton D., Kauff- 
mann G., 2006, MNRAS, 366, 499 

Dommguez Sanchez H., Pozzi F., Gruppioni C, et al., 2011, 
MNRAS, 417, 900 

Drory N., Bender R., Feulner G., Hopp U., Maraston C, 
Snigula J., Hill G. J., 2003, ApJ, 595, 698 

Fontana A., Salimbeni S., Grazian A., et al., 2006, Astron- 
omy and Astrophysics Supplement Series, 459, 745 

Fontanot F., Monaco P., 2010, MNRAS, 405, 705 

Fontanot F., Somerville R. S., Silva L., Monaco P., Skibba 
R., 2009, MNRAS, 392, 553 

Frenk C. S., White S. D. M., Davis M., Efstathiou G., 1988, 
ApJ, 327, 507 

Fu J., Kauffmann G., Li C, Guo Q., 2012, ArXiv e-prints 
Gallazzi A., Chariot S., Brinchmann J., White S. D. M., 

Tremonti C. A., 2005, MNRAS, 362, 41 
Giallongo E., Salimbeni S., Menci N., Zamorani G., 

Fontana A., Dickinson M., Cristiani S., Pozzetti L., 2005, 

ApJ, 622, 116 

Guo Q., White S., Angulo R. E., Henriques B., Lemson G., 
Boylan-Kolchin M., Thomas P., Short C, 2012, ArXiv e- 
prints 

Guo Q., White S., Boylan-Kolchin M., De Lucia G., Kauff- 
mann G., Lemson G., Li C, Springel V., Weinmann S., 
2011, MNRAS, 413, 101 
Guo Q., White S., Li C, Boylan-Kolchin M., 2010, MN- 
RAS, 404, 1111 
Guo Q., White S. D. M., 2008, MNRAS, 384, 2 
Guo Q., White S. D. M., 2009, MNRAS, 396, 39 
Haas M. R., Schaye J., Booth C. M., Dalla Vecchia C, 
Springel V., Theuns T., Wiersma R. P. C, 2012, ArXiv 
e-prints 

Hastings W. K., 1970, Biometrika, 57, 97 

Hatton S., Devriendt J. E. G., Ninin S., Bouchet F. R., 

Guiderdoni B., Vibert D., 2003, MNRAS, 343, 75 
Helly J. C, Cole S., Frenk C. S., Baugh C. M., Benson A., 

Lacey C, 2003, MNRAS, 338, 903 
Henriques B., Maraston C, Monaco P., Fontanot F., Menci 

N., De Lucia G., Tonini C, 2011, MNRAS, 415, 3571 
Henriques B., White S., Lemson G., Thomas P., Guo Q., 

Marleau G.-D., Overzier R., 2012, ArXiv e-prints 
Henriques B. M., Bertone S., Thomas P. A., 2008, MNRAS, 

383, 1649 



Simulations of the galaxy population: implications for galactic winds and the fate of their ejecta 19 



Henriques B. M. B., Thomas P. A., 2010, MNRAS, 403, 
768 

Henriques B. M. B., Thomas P. A., Oliver S., Roseboom 

1., 2009, MNRAS, 396, 535 
llbcrt O., Salvato M., Le Floc'h E., et al., 2010, ApJ, 709, 

644 

Ilbert O., Tresse L., Zucca E., et al., 2005, Astronomy and 

Astrophysics Supplement Series, 439, 863 
Jones D. H., Peterson B. A., Colless M., Saunders W., 2006, 

MNRAS, 369, 25 
Kampakoglou M., Trotta R., Silk J., 2008, MNRAS, 384, 

1414 

Kang X., Jing Y. P., Mo H. J., Borner G., 2005, ApJ, 631, 
21 

Kauffmann G., 1996, MNRAS, 281, 487 

Kauffmann G., Colberg J., Diaferio A., White S., 1999, 
MNRAS, 303, 188 

Kauffmann G., Hachnelt M., 2000, MNRAS, 311, 576 

Kauffmann G., Hcckman T. M., White S. D. M., ct al., 
2003, MNRAS, 341, 33 

Kauffmann G., White S. D. M., Guiderdoni B., 1993, MN- 
RAS, 264, 201 

Kennicutt Jr. R. C, 1998, ApJ, 498, 541 

Kitzbichler M. G., White S. D. M., 2007, MNRAS, 376, 2 

Lacey C., Silk J., 1991, ApJ, 381, 14 

Li C., White S. D. M., 2009, MNRAS, 398, 2177 

Lu Y., Mo H. J., Weinberg M. D., Katz N. S., 2010, ArXiv 
e-prints 

Maraston C., 2005, MNRAS, 362, 799 

Marchesini D., van Dokkum P., Quadri R., et al., 2007, 

ApJ, 656, 42 

Marchesini D., van Dokkum P. G., Forster Schreiber N. M., 
Franx M., Labbe I., Wuyts S., 2009, ApJ, 701, 1765 

Marchesini D., Whitaker K. E., Brammer G., et al., 2010, 
ApJ, 725, 1277 

Martin C. L., 1999, ApJ, 513, 156 

Metropolis N., Rosonbluth A., Rosenbluth M., Teller A., 

Teller E., 1953, J. Chcm. Phys., 21, 1087 
Moster B. P., Naab T., White S. D. M., 2012, ArXiv e- 

prints 

Moster B. P., Somerville R. S., Maulbetsch C., van den 
Bosch F. G., Maccio A. V., Naab T., Oser L., 2010, ApJ, 
710, 903 

Mutch S. J., Poole G. B., Croton D. J., 2012, MNRAS, 
p. 147 

Norbcrg P., Cole S., Baugh C. M., et al., 2002, MNRAS, 
336, 907 

Oppenheimer B. D., Dave R., 2008, MNRAS, 387, 577 
Oppenheimer B. D., Dave R., Keres D., Fardal M., Katz 

N., KoUmcicr J. A., Weinberg D. H., 2010, MNRAS, 406, 

2325 

Perez-Gonzalez P. G., Rieke G. H., Villax V., Baxro G., 
Blaylock M., Egami E., Gallego J., Gil de Paz A., Pascual 
S., Zamorano J., Donley J. L., 2008, ApJ, 675, 234 

Poh F., Giallongo E., Fontana A., et al., 2003, ApJ, 593, 
LI 

Pozzetti L., Bolzonella M., Zucca E., et al., 2010, Astron- 
omy and Astrophysics Supplement Series, 523, A13 

Pozzetti L., Cimatti A., Zamorani G., Daddi E., Menci N., 
Fontana A., Renzini A., Mignoli M., Poll F., Saracco P., 
Broadhurst T., Cristiani S., D'Odorico S., Giallongo E., 
Gilmozzi R., 2003, Astronomy and Astrophysics Supple- 



ment Series, 402, 837 
Press W. H., Schechter P., 1974, ApJ, 187, 425 
Reddick R. M., Wechsler R. H., Tinker J. L., Behroozi P. S., 

2012, ArXiv e-prints 
Roukema B. F., Quinn P. J., Peterson B. A., Rocca- 

Volmerange B., 1997, MNRAS, 292, 835 
SaUm S., Rich R. M., Chariot S., et al., 2007, ApJ Supp., 

173, 267 

Salimbeni S., Giallongo E., Menci N., et al., 2008, Astron- 
omy and Astrophysics Supplement Series, 477, 763 

Saracco P., Fiano A., Chincarini G., et al., 2006, MNRAS, 
367, 349 

Somerville R. S., Gilmore R. C, Primack J. R., Dominguez 

A., 2011, ArXiv e-prints 
Somerville R. S., Primack J. R., 1999, MNRAS, 310, 1087 
Somerville R. S., Primack J. R., Faber S. M., 2001, MN- 
RAS, 320, 504 

Springel V., White S. D. M., Jenkins A., et al., 2005, Nat., 

435, 629 

Springel V., White S. D. M., Tormen G., Kauffmann G., 

2001, MNRAS, 328, 726 
Steidel C. C, Erb D. K., Shapley A. E., Pettini M., Reddy 

N., Bogosavljevic M., Rudie G. C, Rakic O., 2010, ApJ, 

717, 289 

Strickland D. K., Heckman T. M., 2009, ApJ, 697, 2030 
Tonini C, Maraston C, Thomas D., Devriendt J., Silk J., 

2010, MNRAS, 403, 1749 
Vale A., Ostriker J. P., 2004, MNRAS, 353, 189 
Wang L., Weinmann S. M., Neistein E., 2012, MNRAS, 

p. 2472 

Weinmann S. M., Kauffmann G., van den Bosch F. C, 

PasquaU A., Mcintosh D. H., Mo H., Yang X., Guo Y., 

2009, MNRAS, 394, 1213 
Weinmann S. M., Pasquali A., Oppenheimer B. D., Finla- 

tor K., Mendel J. T., Grain R. A., Maccio A. V., 2012, 

ArXiv c-prints 

Weinmann S. M., van den Bosch F. C., Yang X., Mo H. J., 

2006, MNRAS, 366, 2 
White S. D. M., 1989, in Frenk C. S., sEllis R. S., Shanks T., 

Heavens A. R., Peacock J. A., eds, NATO ASIC Proc. 264: 

The Epoch of Galaxy Formation Observable signatures of 

young galaxies, p. 15 
White S. D. M., Frenk C. S., 1991, ApJ, 379, 52 
Willmer G. N. A., Faber S. M., Koo D. G., et al., 2006, 

ApJ, 647, 853 

Zucca E., Bardelli S., Bolzonella M., ct al., 2009, Astron- 
omy and Astrophysics Supplement Series, 508, 1217 



APPENDIX A: PHYSICAL PRESCRIPTIONS 

Our semi-analytic model is built on top of high-resolution 
N-body simulations that provide a detailed model for the 
evolution of the dark matter distribution. At each time, a 
friends-of-friends algorithm is used to define a set of haloes, 
each of which is then further divided into a set of subhaloes, 
disjoint, self-bound particle clumps with a single gravita- 
tional potential minimum where a galaxy is assumed to 
form. Each subhalo at time i is linked to a unique descen- 
dant at time i+1 by following its constituent particles. These 
links produce merger trees which specify the full assembly 
history of every z = subhalo, and are the data structures 
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on which the semi-analytic model operates to follow the for- 
mation and evolution of the galaxy that each subhalo con- 
tains. As subhaloes form and grow through accretion, new 
dark matter is assumed to bring the global mean baryon 
fraction with it. These baryons are transferred among the 
various baryonic components associated with the subhalo 
(hot gas atmosphere, cold disk gas, ejected gas, disk and 
bulge stars) by the semi-analytic model. In very low-mass 
haloes, gas infall is suppressed by UV heating once the Uni- 
verse is ionized. At higher mass, accretion of newly acquired 
gas occurs in two modes: at high redshift and for subhaloes 
with mass below ~ 10^^'^ Mq, cooling is rapid and infalling 
gas accretes directly onto the central galaxy; at later times 
and in more massive haloes, the gas shocks to form a quasi- 
static atmosphere from which it gradually accretes onto the 
galaxy through a cooling flow. Although this spherical model 
does not allow filamentary inflow to coexist with a hot at- 
mosphere, it produces galaxy accretion rates as a function 
of halo mass and redshift which agree reasonably with those 
derived from detailed gas dynamical simulations. In prac- 
tice, the situation is considerably complicated by the strong 
galactic winds required in all realistic models. 

Star formation occurs in cold disk gas, either quiescently 
or in merger-induced bursts. A fraction of the stars formed 
(those with masses above 8M0) is assumed to explode as 
type II supernovae. This mass is instantaneously returned to 
the cold phase and the energy released is used to reheat gas 
from the cold to the hot phase. Left-over energy is used to 
eject gas from the hot phase into an external reservoir. This 
ejected gas can later be reincorporated in the hot phase, and 
it is this aspect of the Gil model which we modify in this 
paper, as described in Section U The evolution of massive 
stars also produces new metals that are deposited in the 
cold phase. Each newly formed galaxy is assigned a central 
seed black hole that then grows through accretion of cold 
gas during mergers and through quiescent accretion of hot 
gas. When the black hole reaches a sufficiently large mass, 
this quiescent accretion is assumed to power a radio source 
which offsets cooling in the hot gas, thereby quenching star 
formation. 

In the following subsections, we present the equations 
describing those aspects of the semi-analytic modelling for 
which the parameters have been allowed to vary in our 
MCMC analysis. These parameters are qsf, ^agn, /bh, e, 
Keheat, fil , Tj, Veject, (^2, l' ^nd Zyidd • For s, moi e complete 
descri ption of the model we refer the reader to IGuo et all 
l|201ll ). 



Al Star formation 

Over most of the lifetime of a galaxy, star formation happens 
in a quiescent mode through the fragmentation of cold gas 
that reaches a sufficiently high density. The star formation 
rate in this mode is modeled as 



(mcoid — merit) 



(Al) 



^dyn,disk 

where mcoid is the mass of cold gas and tdyn,disk = J'disk/V'max 
is the dynamical time of the disk. Note that the mass locked 
up in stars over a time interval dt is (1 — R)rh* dt, the rest 
being instantaneously returned to the cold disk gas. (R = 
0.43 here denotes the recycled fraction as determined from 



the IChabrie3 l|2003l ) initial mass function.) The threshold 
gas mass above which star formation is assumed to occur, 
merit, is derived f rom a t hreshold surface density at each 
radius as given by iKauffmanni (jlQQQ ) : 



Ecrit(i?) = 120 



Ki 



^200 kms-i 
leading to the estimate 



merit = 3.8 X 10' 



Ki 



200 kms" 



fdisk 

lOkpc 



0PC 



Mr: 



(A2) 



(A3) 



where rdisk is obtained from the specific angular momentum 
of the cold gas disk which in turn is calculated by summing 
the contribut ions to the angu lar momentum vector from all 
infalling gas (|Guo et al.ll201ll '). 

During mergers it is assumed that the cold gas is dis- 
turbed significantly, allowing a large fraction to reach the 
densities necessary for fragmentation. Th is phenomenon is 
modeled (following ISomerville et al.] 1200 ll ) as a higher effi- 
ciency for star formation during merger events: 



: 0.56 



'^central 



(A4) 



The burst parameters are kept fixed in the MCMC sampling; 
of the parameters in this section, only qsf is allowed to vary. 



A2 Supernova feedback 

Massive stars release large amounts of energy into the sur- 
rounding medium both through their stellar winds and when 
they explode as supernovae. This energy can reheat cold 
disk gas and inject it into the hot atmosphere, and it 
may also eject hot gas from the virialised regions of the 
galaxy /subhalo system. The ejecta may then fall back into 
the system at some later time. This feedback-induced cycle 
is treated by the semi-analytic model as a three stage process 
involving gas reheating, gas ejection and the reincorporation 
of ejected gas. 

The mass of gas reheated by star formation is assumed 
to be directly proportional to the amount of stars formed: 



where 



Amr( 



Cdisk = e X 



0.5 



ediskAm* 



(A5) 



(A6) 



Any supernova energy left over after reheating the disk gas 
is used to eject material from the hot atmosphere into an ex- 
ternal reservoir. The total energy input from SN is assumed 
to be: 



1 2 
A£;sN = eiialo X -Am*VsN, 



where 



eiialo 



0.5 + 



Ki 



(A7) 



(A8) 



The ejected mass is then derived from the residual energy 
after reheating: 



-AMejeetKir = A£'s^ 



iAMrehoatKi 



(A9) 
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As described in IGuo et al] (|201ll ). if this equation gives 
SMcjcc < 0, the reheated mass is assumed to saturate at 
5Mrohcat = SEsn/ (|Kdr)- A-lso, Ehaio is forced never to ex- 
ceed unity. Both conditions ensure that the total amount 
of energy used in feedback is limited to that available from 
supernovae. Material ejected from the hot atmosphere can 
be reincorporated at later times according to the models 
described in Section |4l 



A3 AGN feedback 

Three processes are assumed to control the growth of mas- 
sive central black holes and their effects on their envi- 
ronment. Cold gas accretion during mergers fuels "quasar 
mode" growth which is the principal route by which the 
black hole population gains mass. Black holes are also as- 
sumed to accrete quiescently from hot gas atmospheres in a 
"radio mode" which does not add significantly to their mass 
but pumps energy into the hot gas, thus counteracting its 
cooling and quenching the supply of new cold gas for star 
formation. Finally central black holes are assumed to merge 
when their parent galaxies merge. 

The amount of gas accreted in quasar mode is taken to 
be proportional to the mass ratio between the two merging 
galaxies and to the their total amount of cold gas: 



AmBH,Q = 



1-f- (280kms-VKir)2 



(AlO) 



The rate of radio mode accretion from the hot gas is 
taken to be 



niBH.H, 



fcAGN 



108 M, 



./hot 

oT 



200 kms" 



(All) 

where ttibh is the black hole mass and /hot is the mass 
fraction of hot gas in the halo. This accretion is assumed 
to pump energy back into the hot gas through mechanical 
heating at a substantial fraction of the Eddington rate: 



LbH = CmBH.RC 



(A12) 



where c is the speed of light and ( is set to 0.1 (which in 
combination with Aagn determines the heating rate). This 
energy is used to offset the cooling of gas from the hot at- 
mosphere onto the cold disk, resulting in a modified cooling 
rate. 



'^cool ^^cool 



' 

2 vir 



(A13) 



which is required to be non-negative. 



A4 Metal enrichment 

The metals produced by the stars are assumed to follow the 
same processing cycle as the gas. For every solar mass of 
stars formed, a mass .^yiew Mq of heavy elements is returned 
instantaneously to the cold phase, where .Zyieid is known as 
the yield and is the only chemical enrichment parameter 
which we allow to vary in our MCMC analysis. These heavy 
elements then follow the gas flow. They can be locked in 
new long-lived stars, or they can be reheated into the hot 
phase, ejected into the external reservoir, reincorporated in 
the hot phase, and returned to the cold phase by cooling. 



Each of the three gas phases is assumed to be chemically 
homogeneous at each time, i.e. mixing of heavy elements 
within each phase is assumed to be instantaneous. 



APPENDIX B: MONTE CARLO MARKOV 
CHAIN ANALYSIS 



As discussed in iHenrigues et all (|2009h . implementing the 
MCMC approach for cosmological galaxy formation mod- 
elling on simulations like the MS and MS-II raises signifi- 
cant challenges with respect to computational power. In a 
brute force approach, the semi-analytic model would need 
to follow the formation and evolution of up to 20 million 
galaxies throughout the simulation volume for every pa- 
rameter set proposed at a MCMC step. Population distri- 
butions of interest could then be compared with the ob- 
servational constraints to compute a likelihood. The size 
of the calculation and the required nurnbcr of steps make 
this a very costly propositi on. IHenrigues et al.l ((2009|) and 
IHenrigues fc Thomas! (|2010l ) therefore restricted themselves 
to a small but approximately representative sub-volume, 
just 1/512 of the full Millennium Simulation. While reason- 
ably efficient, their approach did not optimise the fidelity of 
the representation of the full model for given computational 
resources. Here, we adopt a scheme which defines a repre- 
sentative set of subhalo merger trees such that abundance 
statistics such as those used as constraints in this paper 
can be modelled with controlled and approximately uniform 
precision and at minimum computational cost across the 
full range of stellar masses (or luminosities) considered. It 
is also important to ensure that we have a properly repre- 
sentative model at all the redshifts considered. Our scheme 
allows merger trees from the MS and MS-II to be used si- 
multaneously, significantly increasing the dynamical range 
in galaxy mass/luminosity which can be used to constrain 
the model. 

Consider the problem of estimating the galaxy luminos- 
ity function $(!/) of a specific model, which we take to be 
the function obtained when the model is applied to the full 
MS and MS-II simulations. We want to find a small subset 
of the halo merger trees which reproduces $(L) to some de- 
sired accuracy over some luminosity range of interest. Let 
us assume that our simulated volume contains A'^ haloes, 
partitioned into a total of / mass bins labelled by i. Then 



N - 



(Bl) 



where Ni is the number of haloes in the i'th mass bin. Fur- 
ther, assume that we wish to estimate a luminosity function 
defined on a set of J luminosity bins labelled by the index 
j. We can write the luminosity function of the simulated 
volume as a whole as 



(B2) 



where 4>ij is the average number of galaxies in luminosity 
bin j for a randomly selected simulation halo in mass bin i. 
We seek a set of numbers Ui such that choosing n; haloes at 
random among the A''^ in the ith bin is sufficient to estimate 
the luminosity to function to some preassigned accuracy, for 
example, ±5%. 
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For a specific subsample of Ui haloes in eacli bin i, a 
straightforward estimate of $j is 



rii 

i=i fe=i 



(B3) 



where Sijk is the number of galaxies in the j'th luminosity 
bin for the fc'th of the rii haloes in mass bin i. For different 
k the Sijk are independent, identically distributed random 
variables with non-negative integer values and mean (j>ij- 
We will assume that they are Poisson distributed, but we 
note that this, in fact, overestimates their variance because 
each halo contains one and only one central galaxy and such 
central galaxies dominate the bright en d of the luminosi ty 
function (see, for example, Fig. 24 of IGuo et al.l (|201ll )'). 
With this Poisson assumption we have 



and 



var <I>, 



E 



rii 



(B4) 



(B5) 



If we wish the rms uncertainty of our luminosity function 
estimate to be a fraction F of its true value in each of the 
luminosity bins, we then require 



1, J. 



(B6) 



For a specific galaxy formation model implemented on 
the Millennium simulations, the Gil model, for example, we 
know the Ni, the $j and the so equations IB6I are a set 
of linear constraint equations on F^rii. In the specific case 
I = J , the matrix <j)ij is square and can be inverted to get 
an explicit solution for the rii, 



1,/. 



(B7) 



If 7 > J and if for every i at least one of the 4>ij is non-zero, 
the solution space of equations IB6I is (7 — J)-dimensional 
and one can normally pick a set of rii values satisfying the 
requirements Q ^ rii ^ Ni and minimising some computa- 
tional cost function, for example, the total number of trees 
chosen, X^i*^*' or the total number of galaxies modeled, 
'^ij riiiftij ■ For our analysis we choose F — 0.05 to ensure 
that the uncertainty of the luminosity and stellar mass func- 
tions estimated for our models are smaller than the obser- 
vational uncertainties on the data we compare them with. 
In total, fewer than 1% of all Millennium trees are needed 
to obtain estimates consistent with the full set to this level 
of accuracy. In practice, we use Millennium-II trees to cal- 
culate abundances for galaxies with logM^/M© < 9.5 and 
Millennium trees to calculate abundances for more massive 
galaxies. 

A representative and sufficiently large sample of haloes 
is needed at all redshifts at which the model will be com- 
pared with observation. Starting at z = 0, we select the "op- 
timum" number rii of FoF haloes at random in each mass 
bin. When weighted by Ni/rii, these provide a representa- 
tive sample of the full cosmic halo population. Hence, when 
combined with the same weights, the galaxies they contain 
at 2: = reproduce the present-day luminosity and stellar 



mass functions (equation [B3|) and the galaxies contained by 
their progenitors a.t z — 1 (say) reproduce these same func- 
tions at that time. It is unclear, however, whether there will 
be enough progenitor haloes in each mass bin to satisfy our 
S/N requirements at z = 1. We check explicitly whether this 
is the case. If not, we select sufficient additional z = 1 haloes 
to restore the required precision. Their formation trees are 
then followed back together with those of the original haloes 
to z — 2 and the process is repeated. Note that the repre- 
sentative halo set and the associated trees and weights are 
determined once using a fiducial model. Galaxy formation is 
then followed sequentially from high to low redshift in this 
predefined set of trees for every parameter set in our MCMC 
analysis. We have verified explicitly that these procedures 
produce luminosity and stellar mass function estimates at 
all redshifts which reproduce the functions for the full simu- 
lations at least as accurately as inferred from the simplified 
analysis above. 



APPENDIX C: OBSERVATIONAL ERRORS 

In this Appendix we discuss in more detail our method 
for combining individual datasets to construct observational 
constraints with realistic error bars for each of the popula- 
tion properties used in the MCMC sampling presented in 
section [S] This is a crucial aspect of the MCMC analysis 
as the allowed regions in parameter space and the good- 
ness of a fit to multiple properties depend strongly on the 
observational uncertainties. While errors based on counting 
statistics and cosmic variance are given in most observa- 
tional studies it is much harder to get a realistic assessment 
of systematic uncertainties. These can show up as apparent 
inconsistencies between different determinations of the same 
population property, and failing to account for them can 
jeopardise a meaningful comparison with theoretical predic- 
tions. 

Here we follo w Henriques et al.l l|2009l ) and 
iHenrioues fc Thomas! |2010'), using multiple determi- 
nations of each observational property and taking the 
scatter among them to indicate likely systematic uncer- 
tainties. For each observational property, at each redshift, 
we re-bin the individual estimates into a fixed set of broad 
bins. In each bin we then discount any determination with 
very large error bars and take a straight average of the rest 
as the constraint and their maximum and minimum values 
as its ±1(7 uncertainty. By not weighting the averages we 
attempt to account for the fact that systematic errors can 
affect large and small surveys in s imilar ways. This can be 
seen ^ for example, in the w ork of iMarchesini et al.l (|2009l ) 



and IMarchesini et all 
and Cirasuolo et al.l 



201C) for the stellar mass functions 



20101 ) for the Tf-band luminosity 



function. Counting errors are swamped in their overall 
error budget by combined uncertainties from SED fitting 
assumptions, photometric redshift errors, photometric 
errors, extrapolations from the observed photometry and 
cosmic variance. 

Finally we harmonise the sizes of error bars to avoid 
dramatic changes between bins, for example, where there 
is a change in the number of surveys included. Typically, 
we re-size the error bar to the value of the next consecu- 
tive bin towards the knee of the function. More surveys are 
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Figure CI. Evolution of the stellar mass fu nction from z = 3 to z = as in Fig.[4l excep t with the data points for the individual under- 
lying surveys a.lso shown. These surveys a r e|Baldrv et al] ||200^) a nd iLi fc Whit"T ll2009|) at z = 0, and G OODS-MUSIC ( Fontana ct aL 
2006 ). Spitzcr ('Porez-Gonzaloz ct al. 2008), Marchcsini ct al. (2009), Spitzcr-COSMOS (" llbert et al.l2o"loh . NEWFIRM ( Marchcsi ni et al. 



201Ct) . zCO SMOS (Pozzctti ct al. 2010) and COSMOS (Domfrisucz Sanchez ct al. 201Ji) at higher redshift. All mas s esti mates at 2 > , 



except [Pomfnguez Sanc hez et al.l ll201 lj) have been shifted by -0.14 dcx to convert from iBruzual &: Charlo"3 l|2003l l to [Marastonl l|2005l ) 
stellar populations llDomfnguez Sanchez et al.ll201ll l. The z = results of iLi fc Whit3 ||2009|'| are repeated at all redshifts as a black 
dotted line. 



normally available towards the knee and their counting er- 
rors are usually smaller there, making systematics estimates 
more robust in this region. Clearly our procedure is an art 
rather than a quantitative science, but we believe it gives 
realistic estimates of the overall level of uncertainty given 
the information available. 

Our adopted constraints are shown together with the 
individual datasets on which they are based in Figure ICll 
for the stellar mass functions, in Figure [C2l for the K-hand 
luminosity functions, and in Figure [C3l for the B-band lumi- 
nosity functions. The constraints are shown as blue dots with 
error bars while other data points represent observational es- 
timates from the individual surveys. Theoretical predictions 
are shown as red lines. For the stellar mass function, the 
most massive bin at z = and the most massive and least 
massive bins at z — 1 have been re-sized according to the 
method described in the last paragraph. The same was done 
for the faintest bin in the z — 3 -B-band luminosity function. 

We emphasise that despite our attempt to estimate un- 
certainties from the data, our method still involves arbitrary 
judgements about the quality of data and the way in which 
different surveys should be combined. As a result, formal 
levels of agreement between theory and observation should 
be treated with caution. 
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Figure C2. Evolution of the rest-frame X-band luminosity function from 2: = 3 to 2 = as in F ig. [S] except that data points for 
the individual su rveys are shown. These are three different 2MASS-base d determinations at 2 = jCoIe e t al. 2001; Bell ct al. 20(3; 
I Jones et aLll2006t). and dete rminations at high er re dshift from MUNICS l|Drorv et al.ll2003l). the K20 survey UPozzetti et al.ll2003!l . the 
HDF-S llSaracco et al.ll2006l') . GOODS-CDFS (Caputi et al. 2006}, and the UKIDSS-UDF jCirasuolo et ahlboid) . The 2 = results of 
I Jones et al .1 l|2006l 'l are repeated at all redshifts as a black dotted line. 
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Figure C3. Evolution of tlie rest-frame _B-band luminosity function from z = S to z = as in Fig. [6] e xcept that data po ints for the 
individual surveys are shown At 2 = thes e are the 2DFGRS jN orbcrg ct al.ll2002h and the 6DFGR S l ljones et al.ll2006l'). At highe r 
redshift, they are for HDF-S ijPoli et al."2003'), HDF-N jCiallongo et al.. 2005). VVDS (Ilbcrt et al."2005"), DEEP2 ('Willmcr et al.'''200e|), 
GOODS-MUSYC p lus FIRES (Marchcsi ni ct al. 20o3), GOODS-MUSYC (Salimbcni ct al. .200&') . and zCOSMOS dZucca ct al..,2009ii . 
The 2 = results of ljones et al .1 I2OO6I ) are repeated at all redshifts as a black dotted line. 



